Top comments (4)
I may be mistaken, but I think you can say you are doing 'Big Data' whenever you have to process more data than your computer can handle.
When I did some research on it, I was amused to see that, for example, a huge CSV file to process could be considered 'Big Data' if you are working on a veeeery old computer
Okay, interesting, so 'big' is relative to your ability to process it. Does that refer specifically to compute instances from cloud providers, or could it be, like, a laptop?
Yes! I found it quite funny.
And yes, from what I understand it depends on where you are performing your processing. So, if you're using the cloud, then it's relative to the servers you're using.
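To make the "more data than your machine can handle" idea concrete, here is a minimal sketch of streaming a CSV row by row so the file never has to fit in memory. The file name and column name are hypothetical, just for illustration:

```python
import csv

# Hypothetical file and column names; the point is that rows are
# read one at a time, so memory use stays roughly constant no
# matter how large the CSV grows.
running_total = 0.0
row_count = 0

with open("sensor_readings.csv", newline="") as f:
    reader = csv.DictReader(f)
    for row in reader:
        # Only one row is held in memory at a time.
        running_total += float(row["value"])
        row_count += 1

if row_count:
    print(f"mean of {row_count} rows: {running_total / row_count:.3f}")
```

Libraries such as pandas support the same streaming pattern via `read_csv(..., chunksize=...)`, which yields the file in fixed-size pieces instead of loading it all at once.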
Big data can be described by the following characteristics:
Volume
The quantity of generated and stored data. The size of the data determines the value and potential insight, and whether it can be considered big data or not.
Variety
The type and nature of the data. This helps people who analyze it to effectively use the resulting insight. Big data draws from text, images, audio, video; plus it completes missing pieces through data fusion.
Velocity
In this context, the speed at which the data is generated and processed to meet the demands and challenges of growth and development. Big data is often available in real time. Compared to small data, big data is produced more continually. Two kinds of velocity related to big data are the frequency of generation and the frequency of handling, recording, and publishing.
Veracity
An extended dimension of big data, referring to data quality and data value. The quality of captured data can vary greatly, which affects the accuracy of analysis.
Data must be processed with advanced tools (analytics and algorithms) to reveal meaningful information. For example, to manage a factory one must consider both visible and invisible issues with its various components. Information-generation algorithms must detect and address invisible issues such as machine degradation and component wear on the factory floor.
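As a toy sketch of what such an algorithm might look like (the sensor values, window size, and threshold below are entirely made up), a rolling z-score can flag readings that drift away from recent behavior, the kind of "invisible issue" described above:

```python
from collections import deque
import math
import random

def detect_anomalies(stream, window=50, threshold=3.0):
    """Flag readings that deviate strongly from the recent rolling window."""
    recent = deque(maxlen=window)
    for i, x in enumerate(stream):
        if len(recent) == window:
            mean = sum(recent) / window
            var = sum((v - mean) ** 2 for v in recent) / window
            std = math.sqrt(var)
            # A reading more than `threshold` standard deviations away
            # from the recent mean is treated as a potential sign of wear.
            if std > 0 and abs(x - mean) / std > threshold:
                yield i, x
        recent.append(x)

# Simulated vibration sensor: mostly stable, with a hypothetical
# degradation injected partway through the stream.
random.seed(0)
readings = [random.gauss(1.0, 0.05) for _ in range(400)]
readings[300:] = [r + 0.5 for r in readings[300:]]

for idx, value in detect_anomalies(readings):
    print(f"possible degradation at reading {idx}: {value:.2f}")
```

Real systems would use far more sophisticated models, but the shape is the same: a continuous stream in, a small amount of state kept, and alerts out.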
Source: Wikipedia