NLP Engineer

Alexandria Technology
 New York, NY

Company Description



Alexandria Technology uses machine learning technology to analyze and classify valuable information in unstructured textual data. Our technology was first developed to decode DNA and now extracts information from financial news for investment professionals.

Alexandria was recently named by JP Morgan as one of the top 40 big data companies to work for.

Job Description

The Natural Language Processing (NLP) Programmer will work directly with the Chief Scientist of Alexandria Technology to develop proprietary text classification software that uses machine learning technology to extract, analyze, and classify information from unstructured textual data for the financial services industry. Specifically, the NLP Programmer will:

1. Design, develop, and implement machine learning software to analyze large amounts of unstructured textual data;

2. Use programming languages such as Java or Python to analyze large textual datasets (gigabytes to terabytes);

3. Use mathematics, statistics, and machine learning techniques to develop algorithms to extract information from textual data;

4. Test and enhance algorithms to measure the accuracy of classification software;

5. Remain up to date on industry research and new techniques in natural language processing (NLP) and conduct research to determine if new techniques improve Alexandria's NLP software;

6. Prototype new algorithms and conduct experiments to test the validity and effectiveness of the methodologies used;

7. Implement algorithms and modules in the production system (our production systems run on cloud servers and use Linux operating systems);

8. Design highly available, fault-tolerant streaming systems to handle real-time, high-volume text feeds (low latency is required for these systems);

9. Review existing programming code to identify bugs/problematic code and fix them as required;

10. Use Python to create signals from information extracted by Alexandria software and run statistical regressions to evaluate and forecast the signals' impact on stock market performance, taking into account complex and non-linear cause-effect relationships;

11. Complete data analysis on Alexandria-extracted information to share with clients and financial services companies.



Qualifications

1. Master's degree or PhD in Computer Science, Data Science/Engineering, or a related field of study is required

2. Minimum of 2 years' work experience, including:

- Prior experience in a data-driven quantitative research environment,

- Solid background in algorithms, data structures, Natural Language Processing (NLP), and Machine Learning (ML),

- Strong experience in programming languages such as C++, Java, Python, or MATLAB

- Experience must also include text analytics such as Named Entity Recognition (NER), Topic Modelling, Sentiment Classification and

- Experience with big data analysis tools (MapReduce, PySpark, etc.)

- Working knowledge of Linux and MySQL

Additional Information


- Experience with common open-source libraries such as OpenNLP, MALLET, Weka, and GATE, and with deep learning toolkits.

- Experience with scalable streaming and/or batch processing with big data frameworks.