As enterprises seek to act quickly and navigate changing markets, a new generation of analytic solutions has mushroomed to deal with data growth and the desire to discover new insight. Over the past many years, I have noticed that when defining a solution for decision support, the first step has often been to build something quick without worrying about building a sustainable infrastructure. But these solutions soon start hitting the wall, because they were either highly customized or built with the least costly option!
As analytic applications move toward prime time, organizations are faced with the hard facts and consequences of their past decisions. With this increased focus, the core of an analytic solution for enhanced ROI needs to be built on enterprise information integration, with tomorrow's needs in mind.
Capability to scale as the system grows in size:
Pay attention to how the database, platform and network handle I/O so that throughput stays acceptable. Each system should be understood individually, so that its scalability choices are clear and the effects of growth can be addressed appropriately. For analytic solutions in particular, data integration and analytic processing are additional choke points, because information must be extracted from large data sets effectively and efficiently. Managing these ingredients effectively is key to a scalable solution.
Throwing more horsepower at the problem, such as increasing processing power or adding more indexes, is a brute-force, short-term fix that holds only until the next crisis emerges. Managing scalability begins with defining an architecture that can start small but grow over time, in coordination with each component of the architecture, physical or logical.
Actual usage statistics often become available only after the first couple of months of use, so most usage assumptions should be critically weighed up front. This should include headroom for growth of the physical architecture, with consideration for multi-threaded processing at all stages, from migration to everyday operation.
The logical architecture should ensure a deployment mode with a clear understanding of what is required for tactical vs strategic decision making: when, what, where and who.
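To make the headroom point concrete, here is a minimal sketch of how a growth assumption translates into time before capacity runs out. It assumes simple compound monthly growth; the function name, the figures and the growth rates are all hypothetical and only illustrate how sensitive the answer is to the assumption.

```python
import math


def months_until_full(current_gb: float, capacity_gb: float, monthly_growth: float) -> int:
    """Whole months until compound growth reaches the provisioned capacity.

    All inputs are illustrative assumptions, not measured values.
    monthly_growth is a fraction, e.g. 0.10 for 10% per month.
    """
    if current_gb <= 0 or capacity_gb <= 0 or monthly_growth <= 0:
        raise ValueError("sizes and growth rate must be positive")
    if current_gb >= capacity_gb:
        return 0  # already at or beyond capacity
    # Solve current * (1 + g)^n >= capacity for the smallest integer n.
    return math.ceil(math.log(capacity_gb / current_gb) / math.log(1 + monthly_growth))


if __name__ == "__main__":
    # Compare an optimistic and a pessimistic growth assumption
    # for the same hypothetical 500 GB warehouse on 2000 GB of storage.
    for growth in (0.05, 0.10):
        months = months_until_full(500, 2000, growth)
        print(f"growth {growth:.0%}: capacity reached in about {months} months")
```

Running the comparison with a few plausible growth rates before the pilot, and again once real usage statistics arrive, shows how quickly the planning horizon shrinks when the initial assumption was too optimistic.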
Poor Quality Data and Obsolete Content:
Managing growth based on usage, subject focus, and analytical and reporting capability can avoid costly engineering efforts undertaken as a consequence of growth. Emphasize capturing the appropriate information (metadata) when the data is created. Having the associated metadata from the beginning is crucial to making the information usable for decision making. Master data management should be an embedded and evolving process. Competitive advantage will be gained through continued evaluation and improvement of matching accuracy, data trustworthiness, and the use of critical resources while processing at various stages.
Transforming the data:
Understanding the infrastructure required to perform ETL and ad hoc front-end queries is key to meeting tactical and strategic analytic needs. The stresses on transactional systems and analytic solutions are very different; e.g., unlike transactional systems, analytic solutions are based on utilization rates, scope of data, and so on. In addition, query parameters can vary quite sharply, leading to large data sets. Addressing this is a balancing act based on needs such as freshness of the data, business events, security and privacy, and aggregated view vs detail/trend view. Investment should not be dominated purely by tools, but made in coordination with users' needs and an experienced, balanced deployment team.
Monitoring and preventing bottlenecks or system crashes should be a prime role of the operations team for today's global systems with ever-tightening processing windows. In addition, coordinating with network management to deal with insufficient bandwidth while still providing acceptable system performance is critical to the success of the solution. A console view of operations, with the ability to highlight alerts against defined thresholds, will keep the systems out of crisis mode.
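As a rough illustration of alerts against defined thresholds, the sketch below checks a set of metric readings and reports which ones have crossed a warning or critical level. The metric names and threshold values are hypothetical; a real operations console would pull readings from its own monitoring sources.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    metric: str      # name of the monitored metric (hypothetical)
    warn: float      # level at which the console should highlight a warning
    critical: float  # level at which the console should raise a critical alert


# Illustrative thresholds only; real values depend on the processing window and network.
THRESHOLDS = [
    Threshold("etl_window_used_pct", warn=80.0, critical=95.0),
    Threshold("network_util_pct", warn=70.0, critical=90.0),
]


def evaluate(readings: dict[str, float], thresholds: list[Threshold]) -> list[tuple[str, str, float]]:
    """Return (metric, severity, value) for every reading at or above a threshold."""
    alerts = []
    for t in thresholds:
        value = readings.get(t.metric)
        if value is None:
            continue  # no reading for this metric in the current sample
        if value >= t.critical:
            alerts.append((t.metric, "CRITICAL", value))
        elif value >= t.warn:
            alerts.append((t.metric, "WARN", value))
    return alerts


if __name__ == "__main__":
    sample = {"etl_window_used_pct": 85.0, "network_util_pct": 92.0}
    for metric, severity, value in evaluate(sample, THRESHOLDS):
        print(f"{severity}: {metric} = {value}")
```

Keeping a warning level below the critical level is what gives the operations team time to act before the processing window is actually missed.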
Key consideration should be given to separating the user experience from the physical infrastructure. A semantic layer hides the complexity of the underlying data sources by providing a business representation of organizational data. The semantic layer also makes reusable report elements and powerful calculation capabilities available, allowing users to access key information quickly. In addition, you can use the semantic layer's metadata throughout your organization's BI solutions. The semantic layer should be supported by an enriched data layer for summaries, details and slowly changing dimensions. Deploying KPIs at the semantic layer rather than at the database level often causes data fetch performance problems. Planning each layer with an understanding of usage, auditing and monitoring will enable system efficiency.
The pipeline of changes should be sequenced so that it distinguishes the core changes required for deploying a front-end change. Deploying the foundational back-end elements first, and building on top of them to deliver the front-end functionality, supports optimal return and scalability.
Scalability should be planned for as early as the pilot project phase, to cope with the consequences of growth.
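The idea of a semantic layer can be sketched in a few lines: business terms are mapped to physical columns and expressions, and queries are generated from the business terms alone. This is only a toy illustration, not how any particular BI product implements it; the table name, column names and the `build_query` helper are all hypothetical.

```python
# Hypothetical semantic model: business names on the left, physical SQL on the right.
SEMANTIC_MODEL = {
    "table": "sales_fact",
    "dimensions": {"Region": "region_cd", "Fiscal Quarter": "fiscal_qtr"},
    "measures": {"Revenue": "SUM(net_amt)", "Order Count": "COUNT(DISTINCT order_id)"},
}


def build_query(model: dict, dimensions: list[str], measures: list[str]) -> str:
    """Translate business terms into a SQL string using the semantic model.

    Raises KeyError if a requested term is not defined in the model.
    """
    dim_cols = [model["dimensions"][name] for name in dimensions]
    select_parts = [f'{col} AS "{name}"' for name, col in zip(dimensions, dim_cols)]
    select_parts += [f'{model["measures"][name]} AS "{name}"' for name in measures]

    sql = "SELECT " + ", ".join(select_parts) + " FROM " + model["table"]
    if dim_cols:
        # Measures are aggregates, so group by every selected dimension.
        sql += " GROUP BY " + ", ".join(dim_cols)
    return sql


if __name__ == "__main__":
    # The user asks for "Revenue" by "Region" without knowing the physical schema.
    print(build_query(SEMANTIC_MODEL, ["Region"], ["Revenue"]))
```

Because users only ever see "Region" and "Revenue", the physical table can be re-partitioned, summarized or moved without changing the reports built on top of it.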