Authored by: Mark Dildilian, Dir. Marketing and Business Development
Contributor: Krishman Senthil, SITEK Programmer
Big Data is a relatively new term for datasets so large and complex that conventional, long-established data processing applications are inadequate. Today's challenges in managing Big Data from an IT/technology perspective include building new models designed to capture, curate, store, search, query, share, transfer, analyze and visualize data, all while preserving information privacy within a secure framework.
As technology, Big Data and management merge, newer capabilities will drive new uses for data applications and present new opportunities for government, business, and other organizations. As these advancements take place, a primary objective of Big Data management will be to ensure a high level of data quality, integrity, and accessibility so that information and intelligence can be extracted from data sets via analytic models. Underlying these models will be the constraints of management, administration, and governance of the data sets, both structured and unstructured.
- Structured Data Defined - Structured data depends on first creating a data model that defines the types of data to be collected and how the data will be recorded, stored, processed and accessed. For example, business data could include formats for “structured data types” such as numeric, currency, percentage, fraction, scientific, etc. Historically, because of the high cost of memory and processing, relational databases and spreadsheets (i.e., traditional row-column, fixed-field formats) were the only ways to manage data effectively.
- Unstructured Data Defined - Unstructured data is typically information that does not reside in a traditional row-column, fixed-field format or database. It usually consists of files that include text and possibly multimedia content. Examples include emails, Word ".doc" documents, PowerPoint "PPT" presentations, web content, audio and video files, and many other types of documents that cannot be stored in a database designed around rows and columns or fixed-field formats.
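To make the distinction concrete, here is a minimal Python sketch. The `Invoice` record, its fields and the sample values are hypothetical and exist only for illustration. The structured record has a data model fixed in advance, so it can be parsed and computed on directly, while the unstructured email text has no predefined fields and supports only generic operations until further analysis is applied.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Invoice:
    """Hypothetical structured record: field names and types are fixed up front."""
    invoice_id: int      # numeric
    amount: Decimal      # currency
    tax_rate: Decimal    # percentage, stored as a fraction (0.06 = 6%)

    def total_due(self) -> Decimal:
        # Possible only because the data model guarantees typed fields.
        return self.amount * (1 + self.tax_rate)


def parse_invoice_row(row: str) -> Invoice:
    """Parse one comma-separated, fixed-field row into the data model."""
    invoice_id, amount, tax_rate = row.split(",")
    return Invoice(int(invoice_id), Decimal(amount), Decimal(tax_rate))


def word_count(text: str) -> int:
    """A generic operation that needs no data model at all."""
    return len(text.split())


# Structured: fits a row-column, fixed-field format.
structured = parse_invoice_row("1001,250.00,0.06")

# Unstructured: the same facts as free text, with no fields to query.
unstructured = "Hi team, invoice 1001 for $250.00 went out Friday. Tax was 6%."

print(structured.total_due())
print(word_count(unstructured))
```

The same invoice facts appear in both forms, but only the structured version can answer "what is the total due?" without first interpreting the text.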
IDC and EMC project that data will grow to 40 zettabytes by 2020, a 50-fold increase from the beginning of 2010. Computerworld states, "Unstructured information might account for more than 70%–80% of all data in organizations."
Currently, Big Data management encompasses the use of analytic data strategies so that end-users can absorb and manage fast-growing "pools" of data. These data pools can comprise many "terabytes", and are now growing into "petabytes", of information in a variety of file formats, all of which must be efficiently captured, secured, analyzed and properly archived (a petabyte is 10^15 bytes, or 1,000 terabytes, or 1 million gigabytes). Effective data management from an IT/technology standpoint can help organizations disseminate valuable information that drives processes and produces results.
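The scale of these units is easy to misstate, so a few lines of arithmetic can serve as a check. The sketch below assumes decimal (SI) units, where each step is a factor of 1,000. The `convert` helper is a hypothetical name used only for this illustration.

```python
# Decimal (SI) storage units expressed in bytes.
UNITS = {
    "GB": 10**9,   # gigabyte
    "TB": 10**12,  # terabyte
    "PB": 10**15,  # petabyte
    "EB": 10**18,  # exabyte
    "ZB": 10**21,  # zettabyte
}


def convert(value: float, from_unit: str, to_unit: str) -> float:
    """Convert a quantity from one storage unit to another."""
    return value * UNITS[from_unit] / UNITS[to_unit]


# One petabyte expressed in smaller units.
print(convert(1, "PB", "TB"))
print(convert(1, "PB", "GB"))

# A 50-fold growth to 40 zettabytes implies this starting point, in exabytes.
print(convert(40, "ZB", "EB") / 50)
```

Read together with the IDC/EMC projection, the last line shows that 40 zettabytes in 2020 implies roughly 0.8 zettabytes (800 exabytes) at the beginning of 2010.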
Big Data Trends - Driven in part by the emergence of new and evolving social platforms, Information Technology will become synonymous with Big Data analytics and management. Big Data will "rule" and could become a disruptive, competitive differentiator in the enterprise. IT will become an enabling factor in driving new sources of business intelligence, with the capabilities of cloud computing delivering on its role as a driver of business growth.
Please note:
- Business leaders who view analytics as a simple extension of Business Intelligence will severely misjudge the potential of IT/Big Data/Analytics in driving business results. Traditional BI does not take into consideration the wealth and future value of unstructured data.
- IT and analytics are rapidly morphing from a technology ecosphere that is server-centric to one that is service-centric. That is, systems are migrating to decouple infrastructure, applications and business processes, allowing for a more accurate and focused mix of services to optimize and align analytic capabilities across the enterprise.
- Conventional wisdom states that business process design (adopting new technology) dictates cost-reduction as an important by-product. By utilizing IT/Big Data/Analytics, tomorrow's approach will create data-driven, proactive solutions/services that yield by-products such as increased ROI, customer value (internal/external), customer satisfaction and competitive advantage.
Nucleus Research states, "Given returns of $10.66 for every dollar invested in analytics technologies, organizations that balk at this opportunity do so at their own peril. This translates into competitive advantage."
Best Practices - As with any technology and development initiative, discipline in constructing, implementing and adhering to Best Practices will mitigate risk and produce optimal results. Evolving Big Data environments go beyond relational databases and data warehouse platforms to integrate additional technologies and capabilities, so Best Practices should include a focus on incubating innovative ways of collecting and analyzing Big Data, with the goal of developing improved platforms and architecture. Shaping Best Practices will therefore take on a new paradigm and meaning.
- Big Data is a Business Decision - Typically, IT will take the approach of "build it first, and they will come." Instead, decisions to proceed with Big Data projects must be part of the business planning process. Big Data/Analytics projects are most successful when approached from a business perspective. IT's contribution must remain in the lead in building solutions/applications that fit the defined business needs and then providing the management backup needed to filter the information appropriately throughout the enterprise.
- Gather/Evaluate Business and Data Requirements Before Gathering Data - Big Data requirements consist of two components: 1) Gathering, analyzing and understanding the business requisites. This should be the first step in the process, assuring that all data will align with the particular business function or need. 2) Evaluating the data, which should follow and should include input from the business stakeholders. This component will dictate how the data will be collected, captured, filtered, pruned, modeled, retained, managed and made accessible.
- Steps to Implementation Must Be Agile and Iterative - In most cases, Big Data projects start with a use-case or data set. During the process, the organization and IT must evolve as they begin to understand the data. An Agile/Iterative approach lets the implementation process drive the use and analysis of specific data sets or types of data toward the most efficient model for deployment. In short, start small and then identify more specific or high-valued data.
- Optimize Knowledge Transfer and Skill Sets with an IT "Center of Excellence" - Establishing a "CoE" will assist an organization with Big Data/solution oversight. It provides the ability to develop standards and governance and, most important, to anticipate the future demands of Big Data: creativity, new capabilities, and how best to leverage initial and future investments in knowledge and skills, infrastructure platforms/data warehouses, BI and information architecture maturity. It can also enable knowledge workers to correlate emerging and different types of data and make meaningful, high-valued discoveries for the enterprise.
- Big Data Operating Model Should Align with the Cloud - Most Big Data models/initiatives that align with the cloud will allow for better control of data flow, integration, processing, and analytical modeling. This type of strategy provides the ability to instantly scale up and scale down with demand, offers the added benefit of quick in-and-out prototyping of new Big Data models, and can also provide enhanced security features.
- Embed Big Data/Analytics and Decision-making into Operational IT Workflow/Routine - For Big Data to become a competitive advantage, the organization must adopt new thinking that makes "analytics" the way of doing business and part of the corporate culture. Today's data-driven organizations understand the transition from data as a "must have" to data as a "must do." Given emerging analytic capabilities and the fact that Big Data can deliver competitive advantage, analytics must not reside in departmental silos (e.g., marketing, supply chain). Big Data must be embedded and made part of operations, with IT functioning as the front-end enabler.
- Build a Data Quality Firewall and Quality Assessment Strategy - Big Data should be viewed as a strategic asset, and like any other asset, it has a financial value. Feeding inaccurate data into the data warehouse or system makes it difficult to obtain insights and gather actionable information; the value of the data will not only decrease, but the bad data also risks damaging the good data. By building a Data Quality Firewall and Assessment Strategy, the organization will accomplish two key imperatives: 1) Detecting and blocking bad data at the point of entry prevents it from corrupting the enterprise information sources. 2) Tracking data quality problems enables the organization to develop safeguards against data received from external sources such as third-party data providers and external applications. This is especially important with legacy systems, where bad data has a tendency to be buried. Benefits to the organization include better governance and data management policies and practices.
- Plan for Disruption - Today's data-centric organizations have a good view of new data sources, methodologies and practices. Every IT manager who is engaged with Big Data knows that technology change and disruption are a way of everyday business life. These managers have enacted strategies and system architectures designed to accommodate threats and disruption. However, as new capabilities present themselves, it is the unknown that still needs to be an area of focus. Be prepared!
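Of the practices above, the Data Quality Firewall is the most directly expressible in code. The following is a minimal Python sketch, not a reference implementation. The `QualityFirewall` class, the two validation rules and the source names are all assumptions made for illustration; a real deployment would define its own rules from the business requirements. It shows the two imperatives: blocking bad records at the point of entry, and tracking quality problems per source.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class QualityFirewall:
    """Hypothetical firewall: blocks bad records and tallies problems by source."""
    accepted: list = field(default_factory=list)
    rejected: list = field(default_factory=list)
    problems_by_source: Counter = field(default_factory=Counter)

    def check(self, record: dict) -> list:
        """Return a list of problems; an empty list means the record is clean."""
        problems = []
        # Illustrative rule 1: every record needs a non-empty customer_id.
        if not record.get("customer_id"):
            problems.append("missing customer_id")
        # Illustrative rule 2: amount must be a non-negative number.
        amount = record.get("amount")
        if not isinstance(amount, (int, float)) or amount < 0:
            problems.append("invalid amount")
        return problems

    def ingest(self, record: dict, source: str) -> bool:
        """Admit a clean record, or block it and log its problems by source."""
        problems = self.check(record)
        if problems:
            self.rejected.append((source, record, problems))
            self.problems_by_source[source] += len(problems)
            return False
        self.accepted.append(record)
        return True


firewall = QualityFirewall()
firewall.ingest({"customer_id": "C-1", "amount": 19.99}, source="web_app")
firewall.ingest({"customer_id": "", "amount": -5}, source="third_party_feed")
firewall.ingest({"customer_id": "C-2", "amount": "n/a"}, source="third_party_feed")

# Sources with the most problems are candidates for extra safeguards.
print(firewall.problems_by_source.most_common())
```

Keeping the per-source tally separate from the accept/reject decision is a deliberate choice here: the firewall protects the warehouse immediately, while the tally feeds the longer-term assessment strategy for external data providers.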
Developing a Big Data/Analytics initiative starts with a solid, well-defined data management strategy that relies on the selection and implementation of a cutting-edge Big Data/Analytics solution/service mix. Most organizations struggle to deliver effective processes that establish proper guidelines and methodologies. In short, a majority of organizations use only a fraction of the enterprise information systems that could provide the kind of actionable Big Data/Analytic insights needed to accelerate and deliver superior business performance. What is needed includes, but is not limited to, an effective organizational management team, specific business-user-focused initiatives that drive business requirements, and leading Information Technology expertise structured to deliver Big Data leadership.
At SITEK we understand Big Data, and from an IT perspective, we know the importance of developing and implementing the appropriate components of technology, infrastructure, business flow and solution/service mix. This understanding enables us to deliver to organizations a competitive advantage in optimizing the process and value of Big Data. SITEK can provide Big Data solutions using Azure, HDInsight, Hadoop and other Big Data tools.
SITEK Can:
- Provide expertise and consulting services to chart your Big Data strategy and initiatives
- Provide capable Big Data IT application/solutions support services whether remote or on-site
- Provide cost-effective Big Data development methods and practices that minimize risk and time
- Provide specialized Big Data application development expertise meeting your business needs
- Provide scalable development solutions designed to meet your future Big Data business requirements
- Provide creative Big Data alternatives targeting tech. infrastructure, business flow and solution/service mix
Please feel free to contact SITEK so that we can discuss your business needs and priorities and offer solutions that are designed for success: www.siteksolutions.com.
About SITEK Inc. - Founded in 2006 and headquartered in Lexington, Kentucky, SITEK provides technology-driven solutions for clients large and small. SITEK has delivered solutions for global clients in diverse industries including Healthcare, Manufacturing, Utilities, Insurance, Government and Education. SITEK also provides innovative solutions to technology staffing needs. SITEK has the experience to place qualified candidates in the U.S. and internationally, delivering the right resources for any company.
SITEK - Core Competencies
- System Architecture and Design
- Application Development
- Project Management
- Document Management (SharePoint/ImageNow)
- Testing and Quality Assurance
- Placement and Recruiting
SITEK - Key Differentiators
- Proven track record
- A decade of customer satisfaction
- Complete software lifecycle experience
- Experienced in diverse technologies
- 100% minority owned small business
- Located centrally with global reach