Technological Thoughts by Jerome Kehrli

Entries tagged [data]

Big Data and private banking, what for ?

by Jerome Kehrli

Posted on Wednesday Oct 05, 2016 at 10:50AM in Computer Science

Big Data technologies are increasingly used in retail banking institutions for customer profiling or other marketing activities. In private banking institutions, however, applications are less obvious and there are only very few initiatives.
Yet, as a matter of fact, there are opportunities in such institutions and they can be quite surprising.

Big Data technologies, initiated by the Web Giants such as Google or Amazon, enable to analyze very massive amount of data (ranging from Terabytes to Petabytes). Apache Hadoop is the de-facto standard nowadays when it comes to considering Open Source Big Data technologies but it is increasingly challenged by alternatives such as Apache Spark or others providing less constraining programming paradigms than Map-Reduce.

These Big Data Processing Platform benefits from the NoSQL genes : the CAP Theorem when it comes to storing data, the usage of commodity hardware, the capacity to scale-out (almost) linearly (instead of scaling up your Oracle DB) and a much lower TCO (Total Cost of Ownership) than standard architectures.

Most essential applications for such technologies in retail banking institutions consist in gathering knowledge and insights on the customer base, customer's profiles and their tendencies by using cutting-edge Machine Learning techniques on such data.

In contrary to retail banking institutions that are exploiting such technologies for many years, private banking institution, with their very low amount of transactions and their limited customer base are considering these technologies with a lot of skepticism and condescension.

However, in contrary to preconceived ideas, use case exist and present surprising opportunities, mostly around three topics :

  • Enhance proximity with customers
  • Improve investment advisory services
  • Reduce computation costs

Read More

Data Management

by Jerome Kehrli

Posted on Thursday Aug 07, 2014 at 04:22PM in Computer Science

Data management comprises all the disciplines related to managing data as a valuable resource.
Data Management is the development and execution of architectures, policies, practices and procedures that properly manage the full data lifecycle needs of an enterprise.

During my MS studies, I followed two interesting lectures related to Data Management.

  • Introduction to Data Mining (Summary)
  • Introduction to Information Retrieval (Summary)

Data Mining is the computational process of discovering patterns in large data sets involving methods at the intersection of artificial intelligence, machine learning, statistics, and database systems. In the context of Data Mining, Data Warehouses (DW) form an important aspect. Data Warehouses generalize and consolidate data in multidimensional space. The construction of DW is an important pre-processing step for data mining involving data cleaning, data integration, data transformation.
I have summarized all the notes I have taken during the Introduction to Data Mining lecture as well as some of my solutions to the exercises within the following document : Summary of the Data Mining Lecture.

Information retrieval (IR) is finding material (usually documents) of an unstructured nature (usually text) that satisfies an information need from within a large collections (usually stored on computers). Information Retrieval is a field concerned with the structure, analysis, organisation, storage, searching and retrieval of information.
Here as well I have summarized the notes taken during the lecture within the following document : Summary of the Information Retrieval Lecture.