At FactSet, we're working to be the best financial data provider on earth. To get there, we need highly motivated, talented individuals who are empowered to find answers through creative technology.
As a Software Engineer in Content Engineering, you will be part of our Digital Transformation, a mission to automate our data acquisition, quality assurance, content creation and analytics in a scalable cloud environment.
With the guidance of financial experts, you will leverage these large data sets to improve the quality and extend the scope of FactSet's existing and next-generation products.
You will also take part in an agile and collaborative environment.

Responsibilities
Ingest and analyze various data sources to drive innovation in content creation.
Automate the acquisition, relevance scoring and storage of incoming sources.
Explore and evaluate new data technologies to build a scalable, cloud-oriented data platform.
Optimize data retrieval, and develop dashboards and other visualizations for financial experts.
Develop data-driven solutions leveraging machine learning concepts.
Develop solutions for extracting necessary information from unstructured texts.
Develop processes for evaluating and improving the data quality and accuracy in content creation.
Understand, leverage, and improve the in-house data pipeline centered on Digital Transformation.

Essential Requirements
Experience in data engineering, business intelligence or business analytics.
At least senior-level software development experience (4+ years).
A successful history of manipulating, processing and extracting value from large disconnected datasets.
Strong analytic skills related to working with unstructured datasets.
Strong experience and proficiency with Python, Pandas, Scikit-learn, NumPy, NLTK, and TensorFlow.
Understanding of statistical and machine learning algorithms and concepts.
Ability to grasp new concepts and technologies quickly and efficiently.

Desirable Requirements
Experience in performing exploratory data analysis and design of experiments.
Experience with open source NLP libraries, e.g., spaCy and Stanford CoreNLP.
Experience working on cloud environments and services, preferably AWS, e.g., Amazon Kinesis, Redshift, EC2, S3.
Experience building and optimizing 'big data' data pipelines, architectures and data sets.
Experience with modern data platforms such as Spark, Hadoop or other MapReduce big data systems and services.
Experience with a variety of data stores such as ArangoDB, MongoDB, Cassandra, HBase, DynamoDB.
Experience in applying different Site Reliability Engineering and DevOps practices in ensuring quality and productivity.