Introduction to Machine Learning with the Python Library Scikit-learn, with an Example



 

Machine learning is a branch of artificial intelligence that aims to understand how humans learn and to develop strategies that mimic that process. Its techniques use data and algorithms to improve performance on a given set of tasks, and they usually fall into one of the three most prevalent learning categories:
 

  • Supervised learning: a type of machine learning that learns the relationship between inputs and outputs from labeled examples.
  • Unsupervised learning: a type of machine learning where the algorithm is trained on unlabeled data and tries to learn the patterns and structure in the data without explicit guidance.
  • Reinforcement learning: a method of machine learning in which a software agent learns to take the actions in an environment that lead to the maximum reward.
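
As a rough illustration of the first two categories, the sketch below fits a supervised model, which is given both inputs and labels, and an unsupervised one, which sees only the inputs. The toy data and variable names are made up for this example. Reinforcement learning is left out because Scikit-learn does not provide algorithms for it.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

# Toy data (made up for illustration): six 1-D inputs in two well-separated groups.
X = np.array([[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]])
# Labels for the supervised case: y is exactly 2 * x.
y = 2.0 * X.ravel()

# Supervised learning: the model is given the inputs AND the expected outputs.
reg = LinearRegression().fit(X, y)
slope = reg.coef_[0]

# Unsupervised learning: the model is given the inputs only and looks for structure.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
cluster_ids = km.labels_

print("learned slope:", slope)
print("cluster assignments:", cluster_ids)
```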

Scikit-learn example:


Data processing is a vital step in the machine learning workflow because data from the real world is messy. It may contain:

  • Missing values
  • Redundant values
  • Outliers
  • Errors
  • Noise
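
A minimal sketch of how some of these problems can be spotted with pandas is shown here. The small DataFrame is hypothetical, and the 1.5 * IQR rule used for outliers is only one common convention, not the only option.

```python
import numpy as np
import pandas as pd

# Hypothetical messy data: one missing value, one duplicated row, one extreme value.
df = pd.DataFrame({
    "experience": [1, 2, 2, 3, 40, 4],
    "salary": [50000.0, 60000.0, 60000.0, np.nan, 65000.0, 70000.0],
})

# Missing values per column.
missing_counts = df.isnull().sum()

# Redundant (fully duplicated) rows.
n_duplicates = int(df.duplicated().sum())

# Simple outlier check on one column using the 1.5 * IQR rule.
q1, q3 = df["experience"].quantile([0.25, 0.75])
iqr = q3 - q1
is_outlier = (df["experience"] < q1 - 1.5 * iqr) | (df["experience"] > q3 + 1.5 * iqr)
outliers = df[is_outlier]

print(missing_counts)
print("duplicated rows:", n_duplicates)
print(outliers)
```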

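Below is an example of a machine learning workflow with Scikit-learn. It loads a hiring dataset, cleans it, trains a linear regression model, and saves the model to disk: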


import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import sklearn
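# Load the dataset and count the missing values in each column
df = pd.read_csv('hiring.csv')
df.isnull().sum()

# Fill missing test scores with the column mean and missing experience with 0
df['test_score(out of 10)'] = df['test_score(out of 10)'].fillna(df['test_score(out of 10)'].mean())
df['experience'] = df['experience'].fillna(0)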

# Convert experience written as a word (e.g. 'five') to a number
def stringToNum(word):
  word_to_num = {'zero': 0, 'one': 1, 'five': 5, 'two': 2,
                 'seven': 7, 'three': 3, 'ten': 10, 'eleven': 11, 0: 0}
  return word_to_num[word]

df['experience'] = df['experience'].apply(stringToNum)

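# Features: the first three columns; target: the last column
x = df.iloc[:, :3]
y = df.iloc[:, -1]

# Hold out 10% of the rows for testing
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.1, random_state=5)

# Fit a linear regression model on the training set
from sklearn.linear_model import LinearRegression
mymodel = LinearRegression()
mymodel.fit(x_train, y_train)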


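# Predict on the held-out rows, then on one new candidate (same column order as x)
y_pred = mymodel.predict(x_test)
new_pred = mymodel.predict(pd.DataFrame([[5, 8, 7]], columns=x.columns))

# Save the trained model to disk
import pickle
with open("model.pkl", "wb") as f:
  pickle.dump(mymodel, f)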
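
A saved model is only useful if it can be loaded again. The sketch below shows that round trip on synthetic data, since hiring.csv is not included here. The feature weights, the temporary file location, and the variable names are assumptions made for this illustration. Note that pickle files should only be loaded from sources you trust.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for the hiring data: three numeric features, one target.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(20, 3))
y = X @ np.array([2000.0, 1500.0, 1000.0]) + 20000.0

model = LinearRegression().fit(X, y)

# Save the fitted model to a temporary directory, then load it back.
path = os.path.join(tempfile.mkdtemp(), "model.pkl")
with open(path, "wb") as f:
    pickle.dump(model, f)
with open(path, "rb") as f:
    restored = pickle.load(f)

# The restored model should reproduce the original model's predictions.
sample = np.array([[5.0, 8.0, 7.0]])
print(model.predict(sample), restored.predict(sample))
```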
 

The following are some of the linear models available in Scikit-learn, each with a short description:


1    Linear Regression
One of the best-known statistical models. It studies the relationship between a dependent variable (Y) and a given set of independent variables (X).

2    Logistic Regression
Contrary to what its name suggests, logistic regression is a classification algorithm. It estimates discrete values (0 or 1, yes/no, true/false) using a set of independent variables.
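
To make the "discrete values" point concrete, here is a small sketch with made-up data (hours studied versus pass/fail). It uses Scikit-learn's LogisticRegression with default settings. The numbers are invented for illustration only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Made-up data: hours studied (feature) and pass/fail outcome (label).
hours = np.array([[0.5], [1.0], [1.5], [2.0], [6.0], [6.5], [7.0], [7.5]])
passed = np.array([0, 0, 0, 0, 1, 1, 1, 1])

clf = LogisticRegression().fit(hours, passed)

# predict returns the discrete class; predict_proba returns the class probabilities.
new_hours = np.array([[1.0], [7.0]])
labels = clf.predict(new_hours)
probs = clf.predict_proba(new_hours)

print(labels)
print(probs)
```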

3    Ridge Regression
Ridge regression, also known as Tikhonov regularisation, is the regularisation method that performs L2 regularisation. It modifies the loss function by adding a penalty (shrinkage amount) equal to the sum of the squared magnitudes of the coefficients.
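
The shrinkage effect can be seen by comparing the coefficients of an ordinary least squares fit with those of a Ridge fit. This is a sketch on synthetic data; the alpha value of 100 is deliberately large and chosen only to make the effect visible.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

# Synthetic data: three features with known weights plus a little noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([3.0, -2.0, 1.0]) + rng.normal(scale=0.1, size=50)

# Ridge minimises ||y - Xw||^2 + alpha * ||w||^2, so a larger alpha shrinks w more.
ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=100.0).fit(X, y)

ols_norm = np.linalg.norm(ols.coef_)
ridge_norm = np.linalg.norm(ridge.coef_)

print("OLS coefficients:  ", ols.coef_)
print("Ridge coefficients:", ridge.coef_)
```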

4    Bayesian Ridge Regression
Bayesian regression formulates linear regression using probability distributions rather than point estimates, which gives it a natural mechanism for coping with insufficient or poorly distributed data.
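
One practical consequence of working with distributions is that predictions come with an uncertainty estimate. The sketch below uses Scikit-learn's BayesianRidge on synthetic data; the data-generating line (y = 2x + 1) and the query points are assumptions for this example.

```python
import numpy as np
from sklearn.linear_model import BayesianRidge

# Synthetic data: y is roughly 2 * x + 1, with x between 0 and 10.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(30, 1))
y = 2.0 * X.ravel() + 1.0 + rng.normal(scale=0.5, size=30)

model = BayesianRidge().fit(X, y)

# return_std=True also returns the standard deviation of the predictive distribution.
# The second query point lies far outside the range of the training data.
mean, std = model.predict(np.array([[5.0], [50.0]]), return_std=True)

print("predicted means:", mean)
print("predicted stds: ", std)
```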

5    LASSO
LASSO is the regularisation method that performs L1 regularisation. It modifies the loss function by adding a penalty (shrinkage amount) equal to the sum of the absolute values of the coefficients.
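
A well-known effect of the L1 penalty is that it can drive some coefficients to exactly zero. The sketch below illustrates this on synthetic data in which only the first two of five features matter. The alpha value is an arbitrary choice for the example.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Synthetic data: only the first two features influence the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = 4.0 * X[:, 0] - 3.0 * X[:, 1] + rng.normal(scale=0.1, size=100)

lasso = Lasso(alpha=0.5).fit(X, y)

# Count the coefficients that the L1 penalty set to exactly zero.
n_zero = int(np.sum(lasso.coef_ == 0.0))

print("coefficients:", lasso.coef_)
print("zeroed coefficients:", n_zero)
```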

6    Multi-task LASSO
It fits multiple regression problems jointly while requiring the features selected for every regression problem, also called a task, to be the same. Sklearn offers a linear model called MultiTaskLasso that estimates sparse coefficients for multiple regression problems simultaneously. It is trained with a mixed L1/L2-norm for regularisation.
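
The "same features for every task" constraint can be checked directly on the fitted coefficients. This is a sketch with two synthetic tasks that both depend on the first two features; the weights and the alpha value are made up for illustration.

```python
import numpy as np
from sklearn.linear_model import MultiTaskLasso

# Two tasks that depend only on features 0 and 1, with different weights per task.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
W = np.array([[2.0, -1.0, 0.0, 0.0, 0.0],
              [1.5, 3.0, 0.0, 0.0, 0.0]])
Y = X @ W.T + rng.normal(scale=0.1, size=(100, 2))

mtl = MultiTaskLasso(alpha=0.3).fit(X, Y)

# coef_ has shape (n_tasks, n_features). A feature counts as selected when its
# column is non-zero, and the penalty zeroes whole columns across all tasks.
selected = np.any(mtl.coef_ != 0.0, axis=0)

print(mtl.coef_)
print("selected features:", selected)
```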

7    Elastic-Net
Elastic-Net is a regularised regression method that linearly combines the L1 and L2 penalties of the Lasso and Ridge regression methods. It is useful when there are several correlated features.
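
The correlated-features case can be sketched as follows. Two features are noisy copies of the same underlying signal and a third is pure noise. The data, the alpha value, and the l1_ratio value are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(0)
base = rng.normal(size=100)

# Features 0 and 1 are strongly correlated copies of one signal; feature 2 is noise.
X = np.column_stack([
    base + rng.normal(scale=0.05, size=100),
    base + rng.normal(scale=0.05, size=100),
    rng.normal(size=100),
])
y = 3.0 * base + rng.normal(scale=0.1, size=100)

# l1_ratio mixes the penalties: 1.0 is pure L1 (Lasso-like), 0.0 is pure L2 (Ridge-like).
enet = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y)

print("coefficients:", enet.coef_)
```

With this setup the weight tends to be shared between the two correlated features instead of being placed on only one of them.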

8    Multi-task Elastic-Net
It is an Elastic-Net model that allows fitting multiple regression problems jointly, enforcing the selected features to be the same for all the regression problems, also called tasks.


Clustering:

Clustering is a type of unsupervised learning where the goal is to group similar data points together based on certain features or characteristics.

K-Means Clustering:

K-Means is a partitioning method that aims to partition n data points into k clusters, assigning each point to the cluster with the nearest mean (centroid).
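
A minimal K-Means sketch with Scikit-learn's KMeans class follows. The six 2-D points are made up so that the two groups are easy to see, and k=2 is assumed to be known in advance.

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up 2-D points forming two visibly separate groups.
points = np.array([
    [1.0, 1.0], [1.5, 2.0], [1.0, 1.5],
    [8.0, 8.0], [8.5, 9.0], [9.0, 8.0],
])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)

labels = kmeans.labels_            # cluster index assigned to each training point
centers = kmeans.cluster_centers_  # coordinates of the k cluster centres

# New points are assigned to the nearest cluster centre.
new_labels = kmeans.predict(np.array([[0.0, 0.0], [10.0, 10.0]]))

print(labels)
print(centers)
print(new_labels)
```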

Conclusion:


Scikit-learn's modules include models that you can use to identify patterns in your data, evaluation metrics that you can use to gauge your model's performance, and preprocessing tools that help you get your data ready to feed into a machine learning model.
