Evaluation Toolkit
Intrinsic
We provide an evaluation toolkit in the evaluation folder.
Run the following commands to evaluate your trained dense vectors on the intrinsic tasks.
$ python ana_eval_dense.py -v <vector.txt> -a CA8/morphological.txt
$ python ana_eval_dense.py -v <vector.txt> -a CA8/semantic.txt
Run the following commands to evaluate your trained sparse vectors.
$ python ana_eval_sparse.py -v <vector.txt> -a CA8/morphological.txt
$ python ana_eval_sparse.py -v <vector.txt> -a CA8/semantic.txt
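To run both CA8 files in one go, the commands above can be wrapped in a small driver script. The following is only a sketch: the helper names `build_commands` and `run_all` are hypothetical and not part of the toolkit, and it assumes the script is started from the evaluation folder so that the relative paths resolve.

```python
import subprocess
import sys

# Analogy files and script names are taken from the commands above.
ANALOGY_FILES = ["CA8/morphological.txt", "CA8/semantic.txt"]
SCRIPTS = {"dense": "ana_eval_dense.py", "sparse": "ana_eval_sparse.py"}


def build_commands(vector_path, kind):
    """Return one command (as an argument list) per CA8 analogy file.

    `kind` is "dense" or "sparse"; this helper is hypothetical and only
    assembles the same command lines shown in this section.
    """
    script = SCRIPTS[kind]
    return [
        [sys.executable, script, "-v", vector_path, "-a", analogy_file]
        for analogy_file in ANALOGY_FILES
    ]


def run_all(vector_path, kind):
    """Run every command in turn and stop at the first failure."""
    for command in build_commands(vector_path, kind):
        subprocess.run(command, check=True)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Example: python run_intrinsic.py dense vector.txt
    run_all(vector_path=sys.argv[2], kind=sys.argv[1])
```

Using `sys.executable` rather than a bare `python` keeps the evaluation in the same interpreter (and environment) that launched the driver.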
Extrinsic
Text Classification: The Book Review dataset contains 20,000 positive reviews and 20,000 negative reviews collected from https://book.douban.com/. Each review carries a user rating from one star to five stars. We label one-star and two-star reviews as negative, and four-star and five-star reviews as positive.
Named Entity Recognition: Financial NER is a dataset of 3,000 financial news articles manually labeled with over 65,000 named entities (person, location and organization).
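The star-to-label rule described for the Book Review dataset can be sketched as follows. This is an illustration, not the code used to build the dataset: the function names are hypothetical, and treating three-star reviews as unlabeled (and dropping them) is an assumption, since the description only covers one-, two-, four- and five-star reviews.

```python
def star_to_label(stars):
    """Map a 1-5 star rating to a sentiment label.

    One and two stars are negative, four and five stars are positive.
    Three stars returns None (assumption: such reviews are left out).
    """
    if stars not in (1, 2, 3, 4, 5):
        raise ValueError(f"expected a rating from 1 to 5, got {stars!r}")
    if stars <= 2:
        return "negative"
    if stars >= 4:
        return "positive"
    return None


def label_reviews(reviews):
    """Turn (text, stars) pairs into (text, label) pairs, skipping unlabeled ones."""
    labeled = []
    for text, stars in reviews:
        label = star_to_label(stars)
        if label is not None:
            labeled.append((text, label))
    return labeled
```

For example, `label_reviews([("great read", 5), ("so-so", 3), ("dull", 1)])` keeps only the first and last review, labeled positive and negative respectively.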
They are available at extrinsic_eval_data.zip in page