INTRODUCTION
Beginner and intermediate data scientists sometimes get lost in their notebooks, hunting for the cells and results of analyses they have already done. As a result, they waste much of their time figuring out their own code and trying to remember their results.
Here are some steps that can solve these problems and save you a lot of time.
The steps are as follows.
1. Create two folders for a project
For each project, create two folders: one for raw data and one for processed data. The raw data folder contains the original files given to you. The processed data folder contains the files you produced by preprocessing, whether with a pipeline or any other method.
This makes the data easier to work with, and you don't need to preprocess it again and again in your notebooks.
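A minimal sketch of this layout in Python, using only the standard library. The folder names `raw_data` and `processed_data`, the helper names, and the cleaning rule (dropping rows with empty fields) are assumptions made for illustration; substitute your own preprocessing.

```python
import csv
from pathlib import Path


def setup_data_dirs(project_root):
    """Create the raw and processed data folders and return both paths."""
    root = Path(project_root)
    raw_dir = root / "raw_data"              # hypothetical folder name
    processed_dir = root / "processed_data"  # hypothetical folder name
    raw_dir.mkdir(parents=True, exist_ok=True)
    processed_dir.mkdir(parents=True, exist_ok=True)
    return raw_dir, processed_dir


def preprocess_once(raw_file, processed_file):
    """Write a cleaned copy of raw_file, unless one already exists."""
    raw_file, processed_file = Path(raw_file), Path(processed_file)
    if processed_file.exists():
        # Reuse the earlier result instead of preprocessing again.
        return processed_file
    with raw_file.open(newline="") as src, processed_file.open("w", newline="") as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            # Assumed cleaning rule: keep only rows with no empty fields.
            if row and all(field.strip() for field in row):
                writer.writerow(row)
    return processed_file
```

Any notebook in the project can then read the file from the processed folder directly, and the raw files are never modified.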
2. Create separate notebooks for different tasks
For large projects, working within a single notebook is impractical, so split the work across two or more notebooks. Otherwise, you can run into problems: the kernel may stop working properly, cells may run slowly, and you lose precious time.
To avoid this, create two or more separate notebooks, and name each one after the task it is supposed to do.
3. Use Markdown
For data scientists, explaining their code and the results of their analysis is very important. So, use markdown to explain your code and note down the results.
You can use markdown when
- You need to give an introduction
- You want to write additional info about the notebook
- You want to write down the results of an analysis
- You want to describe your notebook's structure.
4. Use comments
Data scientists often need to explain their code to others. So, use comments in a cell to explain its different parts.
For example, suppose a cell defines a function that predicts an output using a machine learning model. In that case, you can briefly describe the function's input and output. Finally, remember not to use comments unnecessarily.
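Here is a rough sketch of what such a cell might look like. The function name `predict_output` and the stand-in `RowSumModel` class are hypothetical and exist only to keep the example self-contained; in practice the model would be whatever fitted model your project uses, assuming it exposes a `predict` method.

```python
def predict_output(model, features):
    """Return predictions from a fitted model.

    Input:  model    - any object with a predict(rows) method (assumed fitted)
            features - list of rows, each row a list of numbers
    Output: list with one prediction per row
    """
    if not features:
        return []  # nothing to predict
    return list(model.predict(features))


class RowSumModel:
    """Hypothetical stand-in model: predicts the sum of each row."""

    def predict(self, rows):
        return [sum(row) for row in rows]


predictions = predict_output(RowSumModel(), [[1, 2], [3, 4]])
```

The docstring covers the input and output once, and the single inline comment marks the only non-obvious branch, so the cell stays readable without a comment on every line.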