Big Data and Small Data
In the early 2000s, analyst Doug Laney characterized Big Data by three attributes known as “the three Vs”: Volume, Velocity, and Variety. However, Big Data is not only a matter of science and technology; it also responds to a strategic vision of the business. In short, in Big Data the volume of data matters less than the knowledge it provides, which allows us to make better decisions and better strategic moves.
Until the emergence of Big Data (Datos Masivos in Spanish),
Business Intelligence worked with what we now call Small Data. Today we are in
a position to tell the two apart:
• Small Data
works with smaller volumes of data, while Big Data has worked since 2012 with
petabytes rather than terabytes, since data is collected from sources as varied
as commercial transactions, social media and sensors in machines. Some speak
of Big Data from 4 or 5 terabytes, but as noted, in recent years the talk
is already of petabytes.
• Small Data
works with data that has already been processed and structured, and management
and analysis are carried out on that data, while Big Data manages and analyzes
changing data practically in real time.
• Small Data
works with data from different sources, but always structured, while Big Data
works with a variety of multi-structured data: not only structured numerical
data, but also unstructured data from social networks, e-mail, video, audio or
commercial transactions.
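To put the volume figures from the first point in perspective, the following is a minimal sketch of the arithmetic, assuming decimal (SI) units. The threshold constant and both function names are hypothetical, chosen only for illustration; the 4-terabyte cutoff simply encodes the rule of thumb quoted above and is not a formal definition.

```python
# Decimal (SI) storage units, expressed in bytes.
TB = 10**12  # 1 terabyte
PB = 10**15  # 1 petabyte

# Hypothetical threshold, taken from the "4 or 5 terabytes" rule of thumb.
BIG_DATA_THRESHOLD_BYTES = 4 * TB


def terabytes_in(size_bytes: int) -> float:
    """Express a size given in bytes as a number of terabytes."""
    return size_bytes / TB


def is_big_data_by_volume(size_bytes: int) -> bool:
    """Apply the volume rule of thumb only (velocity and variety are ignored)."""
    return size_bytes >= BIG_DATA_THRESHOLD_BYTES


print(terabytes_in(PB))                      # 1000.0
print(is_big_data_by_volume(500 * 10**9))    # False (500 gigabytes)
print(is_big_data_by_volume(2 * PB))         # True
```

A petabyte is a thousand terabytes, so the move from terabytes to petabytes is a jump of three orders of magnitude. Volume alone is only one of the three Vs, which is why the check above is deliberately labeled as volume-only.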
Small Data works with OLTP (Online Transaction Processing) and EDW
(Enterprise Data Warehouse) software for data management and analysis on DBMS
(Database Management Systems). Widely used database management systems include
MySQL, Microsoft Access, SQL Server, FileMaker and Oracle, which are relational
systems (RDBMS), as well as dBASE, Clipper and FoxPro.
Big Data uses a Data Warehouse that manages structured data
such as financial records, customer and sales data, and combines it with Big
Data systems that store unstructured data. In addition, it incorporates
emerging systems such as Hadoop, a free software framework prepared to work
with NoSQL database management systems (suited to unstructured data), and
Stream Computing to integrate data in motion from different sources,
aiming for response times on the order of milliseconds.
In short, until now our database systems were fed by
large volumes of structured data. The added complexity of data arriving
from different platforms, together with its seasonality and the peaks in
data entry, has required software that allows the company's management
to handle all that information in order to make better decisions
and adopt the right strategy in an ever-changing business environment that
demands a quick reaction.