This week, at the request of Sm03leBr00t, I dabbled in something a little different: linear regression in Python. I really enjoyed the chance to work with machine learning and Python again. The project was simple enough; I think I used around 50 lines of code.
To start, you need three things: matplotlib and pandas at first, and then sklearn a little later.
import pandas as pd
import matplotlib.pyplot as mplot
I had a lot of issues getting all of these packages to work on my WSL Ubuntu subsystem, so I finally broke down and downloaded Anaconda and Spyder. Even after getting the dependencies to work, I still could not display the graphs this program generates. So I will say this now: if you plan on using Python for machine learning and you are using WSL, get something similar to Anaconda and Spyder to run these packages.
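A possible workaround for the display problem, not the route taken in this post, is to skip the plot window and write the figure to an image file. The sketch below assumes matplotlib's non-interactive Agg backend; the sample points and the output file name are made up for illustration.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: renders to files, needs no display
import matplotlib.pyplot as mplot

# Made-up sample points, only for illustration
x_vals = [1, 2, 3, 4]
y_vals = [2, 4, 6, 8]

mplot.scatter(x_vals, y_vals, color='red')
mplot.title('Sample plot')

# Hypothetical output path; any writable location works
out_path = os.path.join(tempfile.gettempdir(), "sample_plot.png")
mplot.savefig(out_path)  # write the figure to disk instead of calling show()
mplot.close()
```

The saved PNG can then be opened from the Windows side with any image viewer.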
After setting up Anaconda and Spyder, I created a simple CSV file to read into my program using pandas. You can create a simple two-column spreadsheet in Google Sheets or Excel and then save it as a CSV file.
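If you would rather not open a spreadsheet at all, a two-column CSV can also be written with a few lines of Python. This is only a sketch: the file name, column names, and values below are invented for illustration and are not the data used in this post.

```python
import csv

# Made-up example data: one header row plus six (x, y) rows
rows = [
    ("YearsExperience", "Salary"),
    (1, 40000),
    (2, 45000),
    (3, 52000),
    (4, 58000),
    (5, 63000),
    (6, 70000),
]

# Hypothetical file name; newline="" lets the csv module control line endings
with open("sample_data.csv", "w", newline="") as f:
    csv.writer(f).writerows(rows)
```

The first column becomes the input (X) and the second column becomes the value to predict (y).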
file_name = input("Enter CSV file: ") #read in user input
dataset = pd.read_csv(file_name) #read in csv file
X = dataset.iloc[:, :-1] #every column except the last (the first col in a two-column CSV)
y = dataset.iloc[:, 1] #data of second col
print(dataset)
After this, I passed the data above into the sklearn machine learning package:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=1/3, random_state=0)
#X_train contains training rows from the 1st col
#y_train contains training rows from the 2nd col
The train_test_split function splits the CSV data into a training set and a test set. The test_size parameter gives the fraction of the data that goes into the test set, here one third. The rows are shuffled before the split, and random_state seeds that shuffle; I set it to zero so the split comes out the same on every run.
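To see what the split does, here is a small sketch on a made-up nine-row data set (the column names and values are assumptions, standing in for the CSV). It shows the resulting sizes and that reusing the same random_state reproduces the same split.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Made-up two-column data set standing in for the CSV
dataset = pd.DataFrame({"x": range(9), "y": [2 * v for v in range(9)]})
X = dataset.iloc[:, :-1]
y = dataset.iloc[:, 1]

# Same random_state means the same shuffle, so both calls give the same split
split_a = train_test_split(X, y, test_size=1/3, random_state=0)
split_b = train_test_split(X, y, test_size=1/3, random_state=0)

X_train, X_test, y_train, y_test = split_a
print(len(X_train), len(X_test))       # sizes of the training and test sets
print(X_test.equals(split_b[1]))       # compares the test rows of both splits
```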
After completing this I needed to train the machine for linear regression, as shown below:
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(X_train, y_train)
The code above imports LinearRegression from sklearn and then trains the machine with the training data from the last step.
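Once the model is fitted, the line it learned can be read back from the coef_ and intercept_ attributes. The sketch below uses made-up points that lie exactly on y = 2x + 1, and the names X_demo, y_demo, and demo_regressor are only for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up points lying exactly on the line y = 2x + 1
X_demo = np.array([[1.0], [2.0], [3.0], [4.0]])
y_demo = np.array([3.0, 5.0, 7.0, 9.0])

demo_regressor = LinearRegression()
demo_regressor.fit(X_demo, y_demo)

slope = demo_regressor.coef_[0]         # one coefficient per input column
intercept = demo_regressor.intercept_   # where the line crosses the y axis
print("slope:", round(slope, 2), "intercept:", round(intercept, 2))
```

With real, noisy data the fitted slope and intercept will only approximate the underlying trend.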
With all of this set up, the only thing left is to set up the graphs and let the user enter the values they want to predict.
#get graph names/ xy name
graph_name = input("Enter ScatterPlot name: ")
x_name = input("Enter Name of X axis: ")
y_name = input("Enter Name of Y axis: ")
# Visualizing the Training set results
mplot.scatter(X_train, y_train, color='red') #create datapoint scatter plot
mplot.plot(X_train, regressor.predict(X_train), color='blue') #create linear line through pts
mplot.title('{} (Training set)'.format(graph_name)) #give graph a title
mplot.xlabel(x_name) #x axis name
mplot.ylabel(y_name) # y axis name
mplot.show() #print out training data graph
# Visualizing the Test set results
mplot.scatter(X_test, y_test, color='red') #create test datapoint scatter plot
mplot.plot(X_train, regressor.predict(X_train), color='blue') #same regression line as above, fitted on the training data
mplot.title('{} (Test set)'.format(graph_name)) #give test graph a title
mplot.xlabel(x_name) #x axis name
mplot.ylabel(y_name) #y axis name
mplot.show() #print out test data graph
With the two graphs created, I then wrote a loop that lets the user enter values and get predictions back from the model:
choice = "y"
while choice.lower() == "y": #loop while the user wants to enter data
    regre_val = float(input("What {} Would You like to find: ".format(x_name))) #get value from user and convert to a float
    y_pred = regressor.predict([[regre_val]])[0] #send value to the model to get result
    print("Here is your {} {:.2f}".format(y_name, y_pred)) #print out result
    choice = input("Would You like to predict more values(y/n): ") #user input to try again
print("Thank You")
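One thing to watch in a loop like this: float() raises a ValueError when the user types something that is not a number, which ends the program. A minimal sketch of a guard, using a hypothetical helper named parse_number that is not part of the original program:

```python
def parse_number(text):
    """Return text converted to a float, or None if it is not a valid number."""
    try:
        return float(text)
    except ValueError:
        return None


# Example: valid input converts, invalid input gives None instead of crashing
print(parse_number("3.5"))
print(parse_number("abc"))
```

Inside the loop, the program could call this helper on the raw input and ask again whenever it returns None.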
This project really opened my eyes to data science, and I appreciate the suggestion, Sm03leBr00t. I will definitely work on something similar in the future.
Github:Repo