Towards Smarter Data Scientists

Can we automate data science?


Many frameworks and services strive to help data scientists find the best possible machine learning algorithm for their problem. Often, these tools focus only on optimizing the algorithm’s hyperparameters.


We believe that the true value of AutoML lies in creating smart, engineered features that are directly inspired by real-life use cases (fraud, churn, recommender systems, ...).

That is what our Feature Factory project is all about. Its first outcome is EventAggregator, a Python library that automatically generates aggregated features from a dataset.

Examples of such generated features include the frequency and recency of each event, along with other statistical information about it.
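
To make this concrete, here is a minimal sketch of the kind of aggregation involved, written in plain Python. It is not EventAggregator's API: the function name `aggregate_events`, the `(user_id, event_date, amount)` event layout and the feature names are all assumptions made up for this illustration. It computes one frequency feature, one recency feature and one simple statistic per user.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def aggregate_events(events, reference_date):
    """Build one feature dict per user from raw events.

    Hypothetical helper for illustration only. `events` is an iterable of
    (user_id, event_date, amount) tuples; `reference_date` is the date the
    features are computed for.
    """
    # Group the raw events by user.
    by_user = defaultdict(list)
    for user_id, event_date, amount in events:
        by_user[user_id].append((event_date, amount))

    features = {}
    for user_id, rows in by_user.items():
        dates = [event_date for event_date, _ in rows]
        amounts = [amount for _, amount in rows]
        features[user_id] = {
            # Frequency: how many events the user generated.
            "event_count": len(rows),
            # Recency: days elapsed since the user's latest event.
            "days_since_last_event": (reference_date - max(dates)).days,
            # A simple statistic over a numeric column of the events.
            "mean_amount": mean(amounts),
        }
    return features


# Toy event log (invented data).
events = [
    ("a", date(2024, 1, 1), 10.0),
    ("a", date(2024, 1, 3), 20.0),
    ("a", date(2024, 1, 10), 30.0),
    ("b", date(2024, 1, 5), 5.0),
]
features = aggregate_events(events, reference_date=date(2024, 1, 15))
```

Even on a toy log like this, the same grouping and aggregation logic tends to be rewritten for every new project, which is exactly the repetition the library aims to remove.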


Data scientists no longer need to spend countless hours rewriting the same code to obtain the same engineered features. They can now concentrate on what really matters: adding more layers to their deep learning network. Or, really, understanding what’s different about each dataset so they can build the best possible tailored features.