kainat Raisa


Space Mission ML Project - Part 1 (EDA)

About the Dataset

The Kaggle dataset we'll be using for this machine learning project is the space mission launch dataset. It covers the mission status, cost, rocket status, and other details of the space missions launched by different space organizations over time.
You can download the dataset from the link below.
Link to the Kaggle dataset: https://www.kaggle.com/datasets/sefercanapaydn/mission-launches

About the Project

The machine learning project we are going to build today explores how space mission costs have changed over time, which space organizations have seen the most successes or failures, and so on. We will also create a predictive model for the success or failure of future space missions. We are going to use Python and some amazingly useful Python libraries to create meaningful stories from the data.

Unleashing the Magic from the Data Begins

Importing the Essential Libraries

In this part of the project, we'll import only the libraries we need for the data analysis (the ML libraries will be introduced in the third part of the project).

(Don't freak out right now. I'll explain each of the libraries and when to use which one.)

Accessing the Data

Here we read the CSV data file into a pandas DataFrame. The head() method shows the first 5 rows of the dataset, which tells us which columns are present and what the data points look like.
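A minimal sketch of this step. The original code was a screenshot, so this is a reconstruction under assumptions: the column names are what I expect the Kaggle file to contain, the rows are made up for illustration, and the file name mission_launches.csv in the comment is hypothetical.

```python
import io
import pandas as pd

# Made-up rows standing in for the downloaded CSV. The column names are an
# assumption about the Kaggle file, not copied from the article.
sample_csv = io.StringIO(
    "Unnamed: 0.1,Unnamed: 0,Organisation,Location,Date,Detail,"
    "Rocket_Status,Price,Mission_Status\n"
    '0,0,SpaceX,"Pad A, USA",2020-01-01,Rocket A | Demo 1,StatusActive,50.0,Success\n'
    '1,1,ISRO,"Pad B, India",2019-01-01,Rocket B | Demo 2,StatusActive,,Success\n'
    '2,2,SpaceX,"Pad A, USA",2018-01-01,Rocket A | Demo 3,StatusRetired,"1,160.0",Failure\n'
)

# With the real file this would be something like
# pd.read_csv("mission_launches.csv") (hypothetical file name).
df = pd.read_csv(sample_csv)

# head() returns the first 5 rows by default (only 3 exist in this sample).
print(df.head())
```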

Cleansing and Filtering the Data

We drop the two unnamed columns using the drop() method, since they add little value to the dataset. Passing inplace=True means the columns are dropped from the original DataFrame rather than from a copy. Then we check head() again to make sure the columns are no longer in the DataFrame.
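Roughly, the drop step could look like this. The column names "Unnamed: 0.1" and "Unnamed: 0" are an assumption about how the two unnamed columns appear after loading, and the tiny DataFrame is made up.

```python
import pandas as pd

# Made-up frame with two leftover index columns (names are assumed).
df = pd.DataFrame({
    "Unnamed: 0.1": [0, 1],
    "Unnamed: 0": [0, 1],
    "Organisation": ["SpaceX", "ISRO"],
    "Mission_Status": ["Success", "Failure"],
})

# inplace=True modifies df itself instead of returning a new DataFrame.
df.drop(columns=["Unnamed: 0.1", "Unnamed: 0"], inplace=True)

# Confirm the two columns are gone.
print(df.head())
```

An equivalent without inplace=True would be df = df.drop(columns=[...]), which some people find easier to follow.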

The info() method reports the non-null count and the data type of each column.
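As a small illustration on made-up data (the column names are assumed), info() prints its summary, and count() gives the same non-null counts as values you can work with:

```python
import pandas as pd

# Made-up sample: one missing Price value.
df = pd.DataFrame({
    "Organisation": ["SpaceX", "ISRO", "SpaceX"],
    "Price": ["50.0", None, "1,160.0"],
})

# Prints column names, non-null counts, and dtypes.
df.info()

# count() returns the non-null count per column as a Series.
non_null = df.count()
print(non_null)
```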

The isna() method flags which values are null. We usually drop null values when there are very few of them, but in this case the Price column carries a large number of nulls, so we'll handle those entries differently.
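A common way to turn isna() into a per-column summary is to chain sum() or mean() onto it. This is a sketch on made-up data with assumed column names, not the article's original code:

```python
import pandas as pd

# Made-up sample in which most Price values are missing.
df = pd.DataFrame({
    "Organisation": ["SpaceX", "ISRO", "SpaceX", "ISRO"],
    "Price": ["50.0", None, None, None],
})

# isna() gives a True/False table; sum() counts the True values per column.
missing_counts = df.isna().sum()
print(missing_counts)

# mean() of the boolean mask is the fraction of missing values.
missing_share = df["Price"].isna().mean()
print(missing_share)
```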

From the info() output we can see that the Price column holds non-numeric values, so we convert the prices into numeric data.
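One possible way to do the conversion, assuming the prices are stored as text with thousands separators (for example "1,160.0"). That format is an assumption, and the values below are made up:

```python
import pandas as pd

# Made-up prices stored as text, one of them missing.
df = pd.DataFrame({"Price": ["50.0", "1,160.0", None]})

# Strip the thousands separator, then convert to numbers.
# errors="coerce" turns anything unparseable into NaN instead of raising.
df["Price"] = pd.to_numeric(
    df["Price"].str.replace(",", "", regex=False),
    errors="coerce",
)

print(df["Price"])
```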

Let's see what the Data is telling us

We have organized the details of all the missions each organization has launched. The first 10 entries give a quick look at them.

Here we look at the total number of missions for each organization and find the organization with the most missions.
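A sketch of how this count could be done with value_counts() and idxmax(). The Organisation column name is assumed and the rows are made up:

```python
import pandas as pd

# Made-up sample: three missions for one organization, one for another.
df = pd.DataFrame({
    "Organisation": ["SpaceX", "ISRO", "SpaceX", "SpaceX"],
})

# Number of missions per organization, largest first.
mission_counts = df["Organisation"].value_counts()
print(mission_counts)

# idxmax() returns the index label (organization name) with the highest count.
top_org = mission_counts.idxmax()
print(top_org, mission_counts.max())
```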


Now we'll see how many missions succeeded, how many failed, and how many ended in a prelaunch or partial failure for each organization.
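One way to get this breakdown is a cross-tabulation. The sketch below assumes columns named Organisation and Mission_Status and uses made-up rows:

```python
import pandas as pd

# Made-up sample of mission outcomes.
df = pd.DataFrame({
    "Organisation": ["SpaceX", "SpaceX", "SpaceX", "ISRO", "ISRO"],
    "Mission_Status": [
        "Success", "Success", "Failure", "Success", "Partial Failure",
    ],
})

# Rows are organizations, columns are mission statuses, cells are counts.
status_table = pd.crosstab(df["Organisation"], df["Mission_Status"])
print(status_table)
```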


A space mission's success or failure is a significant part of an organization's space research history, so we want to know which organization has the most successful missions.
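A possible way to find that organization: keep only the successful missions, count them per organization, and take the largest. Column names and the "Success" label are assumptions, and the rows are made up:

```python
import pandas as pd

# Made-up sample of mission outcomes.
df = pd.DataFrame({
    "Organisation": ["SpaceX", "SpaceX", "SpaceX", "ISRO", "ISRO"],
    "Mission_Status": ["Success", "Success", "Failure", "Success", "Failure"],
})

# Keep only successful missions, then count them per organization.
successes = df[df["Mission_Status"] == "Success"]["Organisation"].value_counts()
print(successes)

# Organization with the highest number of successful missions.
most_successful = successes.idxmax()
print(most_successful)
```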


To see all of the details about a single organization, we filter the data by organization. Enter a space organization's name and its space mission details will be shown. (Here we have printed the data for ISRO. Go and try it out for SpaceX.)
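A sketch of such a filter. The helper missions_for is a hypothetical name, the column names are assumed, and the rows are made up. In a notebook the name could come from input() instead of being hard-coded.

```python
import pandas as pd

def missions_for(df, name):
    """Return the rows whose Organisation matches name, ignoring case."""
    mask = df["Organisation"].str.casefold() == name.casefold()
    return df[mask]

# Made-up sample of missions.
df = pd.DataFrame({
    "Organisation": ["SpaceX", "ISRO", "ISRO"],
    "Detail": ["Rocket A | Demo 1", "Rocket B | Demo 2", "Rocket B | Demo 3"],
    "Mission_Status": ["Success", "Success", "Failure"],
})

# Interactive alternative: name = input("Organization name: ")
isro_missions = missions_for(df, "isro")
print(isro_missions)
```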


Here we look at the status of each organization's rockets.
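One way to summarize this is to group by organization and count each rocket status. The Rocket_Status column and its labels (StatusActive, StatusRetired) are assumptions about the dataset, and the rows are made up:

```python
import pandas as pd

# Made-up sample of rocket statuses.
df = pd.DataFrame({
    "Organisation": ["SpaceX", "SpaceX", "ISRO", "ISRO"],
    "Rocket_Status": [
        "StatusActive", "StatusActive", "StatusActive", "StatusRetired",
    ],
})

# Count each status within each organization, then pivot the statuses
# into columns; combinations that never occur are filled with 0.
rocket_status = (
    df.groupby("Organisation")["Rocket_Status"]
    .value_counts()
    .unstack(fill_value=0)
)
print(rocket_status)
```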
