Evaluation Toolkit

Intrinsic

We provide an evaluation toolkit in the evaluation folder.

Run the following commands to evaluate your trained dense vectors on the intrinsic (analogy) tasks.

$ python ana_eval_dense.py -v <vector.txt> -a CA8/morphological.txt
$ python ana_eval_dense.py -v <vector.txt> -a CA8/semantic.txt

Run the following commands to evaluate your sparse vectors.

$ python ana_eval_sparse.py -v <vector.txt> -a CA8/morphological.txt
$ python ana_eval_sparse.py -v <vector.txt> -a CA8/semantic.txt
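
For readers unfamiliar with analogy evaluation, the following is a minimal sketch of how accuracy on questions of the form "a is to b as c is to d" is commonly computed with the 3CosAdd rule. It is not the code in ana_eval_dense.py or ana_eval_sparse.py, and those scripts may differ. The function name analogy_accuracy, the in-memory dict of vectors, and the tuple format for questions are all assumptions made for illustration.

```python
import numpy as np


def analogy_accuracy(vectors, questions):
    """Hypothetical helper: 3CosAdd accuracy on analogy questions.

    vectors:   dict mapping word -> 1-D numpy array (assumed input format)
    questions: list of (a, b, c, d) tuples meaning "a : b :: c : d"
    """
    words = list(vectors)
    index = {w: i for i, w in enumerate(words)}
    matrix = np.stack([vectors[w] for w in words]).astype(float)
    # Normalize rows so that a dot product ranks candidates by cosine similarity.
    matrix /= np.linalg.norm(matrix, axis=1, keepdims=True)

    correct = total = 0
    for a, b, c, d in questions:
        # Skip questions that contain out-of-vocabulary words.
        if any(w not in index for w in (a, b, c, d)):
            continue
        target = matrix[index[b]] - matrix[index[a]] + matrix[index[c]]
        scores = matrix @ target
        # The three question words are excluded from the candidates.
        for w in (a, b, c):
            scores[index[w]] = -np.inf
        predicted = words[int(np.argmax(scores))]
        correct += predicted == d
        total += 1
    return correct / total if total else 0.0


# Toy vectors, invented purely for illustration.
toy_vectors = {
    "man": np.array([1.0, 0.0, 0.0]),
    "woman": np.array([0.0, 1.0, 0.0]),
    "king": np.array([1.0, 0.0, 1.0]),
    "queen": np.array([0.0, 1.0, 1.0]),
    "apple": np.array([-1.0, -1.0, 0.0]),
}
toy_questions = [("man", "woman", "king", "queen")]
print(analogy_accuracy(toy_vectors, toy_questions))
```

Excluding the three question words from the candidates is a common convention in analogy benchmarks; whether the toolkit follows it should be checked against the scripts themselves.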

Extrinsic

Text Classification

The Book Review dataset contains 20,000 positive reviews and 20,000 negative reviews collected from https://book.douban.com/. Each review carries a user rating from one to five stars. We label one-star and two-star reviews as negative, and four-star and five-star reviews as positive.
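
The labeling rule above can be written as a small function. This is only an illustrative sketch: the name star_to_label is hypothetical, and returning None for three-star reviews is an assumption, since the description does not assign them to either class.

```python
def star_to_label(stars):
    """Hypothetical helper: map a 1-5 star rating to a sentiment label."""
    if stars in (1, 2):
        return "negative"
    if stars in (4, 5):
        return "positive"
    # Assumption: ratings outside these two groups (e.g. three stars) get no label.
    return None


ratings = [1, 2, 3, 4, 5]
labels = [star_to_label(s) for s in ratings]
print(list(zip(ratings, labels)))
```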

Named Entity Recognition

Financial NER is a dataset of 3,000 financial news articles manually labeled with over 65,000 named entities (person, location, and organization).

Both datasets are available as extrinsic_eval_data.zip in page