Machine Learning is a branch of Artificial Intelligence in which algorithms are trained on data and, once trained, are applied to new data to perform classification and prediction tasks.
Examples of these tasks include determining whether a consumer will default on a loan payment, whether an employee will conduct a fraudulent transaction, or whether a promotion will deliver a sales increase above 25%, as well as defining the best way to group customers (clustering) so that each group can have its own dedicated commercial strategy.
These models usually address a complex problem: they are trained on existing data so that they can then make predictions on new data with a measurable level of certainty.
How can we use Machine Learning Models?
- Customer churn (loss) prediction.
- Dynamic pricing optimization.
- Supply chain optimization (demand prediction, etc.).
- Employee productivity and engagement.
- Customized marketing & promotion campaigns.
- Risk assessment & credit scoring.
- Fraud detection.
- Predictive maintenance.
- Brand monitoring.
- Customer sentiment analysis.
- Order cancellation prediction.
There is a great deal of literature on this, but the bottom line is that there are three main types of Machine Learning algorithms based on their functionality:
- Classification. These algorithms determine the class of a data record based on the record's variables and the trained algorithm. The class can be binary (for example Yes / No) or can have multiple options (for example, a digit class would have 10 options or classes). Classification is important because many of today’s challenges involve determining the class of a situation based on specific variables, for example whether a product was produced with quality or not, or whether a transaction is fraudulent or not.
- Regression. These algorithms produce a predictive equation that generates an output for any set of inputs, and they indicate the relative strength of each variable as well as the accuracy of the overall prediction. They are ideal for predicting a value, for example a future sales result.
- Clustering. These algorithms group elements together based on similarity, even when the similarity is not apparent to human beings, and the similarity can span many variables. There are many ways to group information, but clustering is especially useful for finding similar groups that can receive unique value propositions best suited to their characteristics. Examples include marketing segmentation, threat assessment and many other applications.
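As a rough illustration of the clustering idea, here is a minimal k-means sketch in plain Python. The customer data, the choice of two groups and the starting centroids are all hypothetical; a real project would normally rely on an established library and scaled features.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid,
    then move each centroid to the mean of its assigned points."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                # Mean of each dimension across the cluster members.
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[i])
        if new_centroids == centroids:
            break  # assignments have stabilized
        centroids = new_centroids
    labels = [min(range(len(centroids)),
                  key=lambda i: math.dist(p, centroids[i])) for p in points]
    return centroids, labels

# Hypothetical customers: (orders per year, average basket value)
customers = [(2, 20.0), (3, 25.0), (2, 22.0),
             (20, 150.0), (22, 160.0), (19, 155.0)]

# Assumption: two groups, seeded with one customer from each end of the range.
centroids, labels = kmeans(customers, centroids=[customers[0], customers[3]])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Each label is the index of the group a customer falls into, so the two groups could then receive different commercial strategies.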
There are other Machine Learning models as well. For example, some models help identify the most relevant variables or features, or even reduce the number of features (for example Principal Component Analysis). However, these are mainly useful for refining the previous three types of model to obtain better results.
ADDITIONAL SERVICES
Machine Learning models often require a lot of work on the data, such as preprocessing and Feature Engineering. Model optimization can also be a lengthy process that requires dedicated hardware for long sessions in order to achieve the best possible results.
We offer services along the whole process, from defining the project (or even prioritizing multiple projects) through data collection & labeling, model creation and optimization, to model production.
- Goal. What are we trying to achieve and why?
- Data availability & labeling. What data is available? Is that data enough to describe the goal? Is the data labeled, or can it be labeled so that model training can proceed? (Labeled data is data that includes the output of the goal; for example, it includes a 0 if a transaction is OK and a 1 if the transaction is fraudulent.)
- Data hygiene & EDA. Is the data quality fit for purpose? What early conclusions does the exploratory data analysis (EDA) suggest? Should we include all features? Do we need to pre-process the data (scale, transform, encode, etc.)?
- Model creation. Code the solution using one or more models, and define the measurements, optimization criteria and optimization parameters to try to achieve the highest predictive performance.
- Model production. Once we have a solid trained model, create an application that uses it for the required prediction. This is usually done by creating a simple website or application that asks the user for the variables, runs the model with the provided data and displays the prediction so a decision can be made.
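To make the pre-processing question above (scale, transform, encode) more concrete, here is a small sketch in plain Python. The transaction records, column names and helper functions are hypothetical and only illustrate min-max scaling and one-hot encoding on labeled data.

```python
def min_max_scale(values):
    """Rescale numeric values to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]

def one_hot(values):
    """Encode a categorical column as 0/1 indicator columns.
    Categories are sorted so the column order is reproducible."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return rows, categories

# Hypothetical labeled transactions (label: 0 = OK, 1 = fraudulent)
amounts = [10.0, 250.0, 40.0, 990.0]
channels = ["web", "store", "web", "phone"]
labels = [0, 0, 0, 1]

scaled = min_max_scale(amounts)
encoded, categories = one_hot(channels)

# One feature row per transaction: scaled amount + channel indicators.
features = [[s] + e for s, e in zip(scaled, encoded)]
print(categories)   # ['phone', 'store', 'web']
print(features[3])  # [1.0, 1, 0, 0]
```

The resulting feature rows, paired with their labels, are the kind of input a model would be trained on in the model creation step.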
- Purpose relates to the objectives the company has for the business problem it wants to solve with the Machine Learning model, for example the type of information it wants to predict (e.g., whether a promotion will be successful, based on product and customer data).
- Sources include all the places where data needs to be collected from, be it video files, websites, documents or otherwise. They also cover whether these sources are fixed or variable and, if variable, how they vary (for example, how a name varies over time).
- Data relates to the available data and to its labeling, quality and preprocessing so that it can be used for modeling.
- Output relates to what the prediction will be (for example 0 for a “failure” promotion and 1 for a “successful” one), as well as the performance indicators used as success criteria (for example Recall, Precision, F1, Accuracy, etc.).
- Platform relates to the requirements of the application that will put the trained Machine Learning model into production.
- Deployment relates to how this Machine Learning solution is deployed and maintained within the company, circling back to the Purpose.
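The performance indicators mentioned under Output can be computed directly from predictions and true labels. The sketch below uses their standard definitions for a binary problem; the function name and the sample labels are hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical outcomes: 1 = promotion "successful", 0 = "failure"
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Which indicator matters most depends on the Purpose: recall is usually favored when missing a positive case is costly, and precision when false alarms are costly.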
A: It depends on the problem to be solved, the data characteristics, the performance metrics, the optimization criteria, and other factors. Sometimes a complex problem can be solved with a Linear Regression model with a very high level of accuracy, while other times an apparently simple problem will require a complex ensemble with multiple rounds of training (maybe even days) to achieve a modest performance. Every problem and model is unique.
A: It depends. Every problem and model is unique. As a rule, XGBoost models tend to score higher for classification projects, and K-Means or K-Medians are usually very strong models for clustering projects. But, to repeat, every project is unique, and often a model as simple as a Regression or a Decision Tree can provide surprisingly good results for specific applications, or can be easy to translate into simple production code.
A: It depends on the required support. If only the modeling portion is needed and the data is fully ready, it can be very inexpensive. Most projects consume most of their resources during the data collection, labeling, hygiene, and pre-processing phases, so these phases tend to account for the larger share of the cost. In general, Machine Learning modeling is not an expensive artificial intelligence activity.
A: This is a difficult question to answer. Most projects use 80% of their time and resources to produce quality data for the modeling part. The modeling itself is therefore comparatively fast and straightforward, even for complex models that require multiple hours (or days) of optimization. However, data collection and preparation can be very time-consuming, depending on how capable the company is of producing quality data for the analysis and pre-processing phases. These can take anywhere from days to months.
A: SentientInfo has extensive experience with Machine Learning projects of various kinds, covering many applications across multiple business functions. In all cases we code everything ourselves. Another benefit is that we have been users of such models in large corporations, so we understand the complexities of working with large companies, managing multiple stakeholders and dealing with corporate politics to achieve superior results. In summary: we know our stuff, we are lean, and we know how to deliver both technically and in teamwork with corporate teams, including adapting to local culture.