by Jerome Kehrli
Posted on Saturday Oct 08, 2016 at 12:19AM in Computer Science
Ethical hacking is a very interesting field, and a pretty fun hobby. Well, all you penetration testers out there, don't get me wrong: I am well aware that penetration testing and ethical hacking form a full and challenging software engineering field and an actual profession, so don't get upset.
I am rather saying that studying vulnerability exploitation techniques in one's free time is pretty fun and intellectually rewarding. With everything now connected all the time and everywhere, for all kinds of usages (think Internet of Things), the current focus in the field of vulnerability exploitation is really on web applications, networks, distributed systems, etc.
In addition, recent progress in CPU-level and compiler-level protections has made local program exploitation techniques somewhat outdated, and such techniques are not much presented or discussed anymore.
During my master's studies, I followed an extended set of lectures on Ethical Hacking and Software Security in general and got quite interested in the field. At that time I wrote a paper for a university study, which I am reporting today in this blog.
The following article presents various classical vulnerability exploitation techniques on local programs.
by Jerome Kehrli
Posted on Friday Oct 07, 2016 at 12:01AM in Computer Science
I have taken a deep interest in the blockchain topic recently, and this is the first article of an upcoming series around the blockchain.
This article gives an introduction to the blockchain: what it is, in the light of its initial deployment in the Bitcoin project, as well as the technical details and architecture concerns behind it.
We won't focus here on business applications beyond what is required to present the purpose of the blockchain; more concrete business applications and evolutions will be the topic of another post in the coming days / weeks.
This article presents and explains all the key techniques and mechanisms behind the blockchain technology.
The blockchain principles and fundamentals originally come from the design work on Bitcoin. Most of this article focuses on the design and principles of the blockchain as put in place in the Bitcoin system.
Some more recent (Blockchain 2.0) implementations differ slightly while still sharing most genes with the original blockchain, making all that is presented below valid from a conceptual perspective in these other implementations as well.
by Jerome Kehrli
Posted on Wednesday Oct 05, 2016 at 05:17PM in Computer Science
The blockchain and blockchain-related topics are increasingly discussed and studied. There is not one single day where I don't hear about it, be it on LinkedIn or elsewhere.
I kept myself busy with other topics these past few years, mostly large-scale information systems and analytics systems architecture in the finance business, so I really missed the Bitcoin and blockchain hype.
I've been to an OCTO Technology event on the blockchain recently. To be honest I went there more for the pleasure of seeing my former colleagues than for any specific interest in the topic. Yet I listened carefully to OCTO's presentation ... and I didn't imagine I would be so intrigued by, and soon passionate about, the subject.
I strongly believe the blockchain technology has the potential to be one of the most disruptive advances in computer science of the last 10 years. I have studied, and keep studying, all the technical details, evolutions and business implications of this technology and will post various blog articles in the coming days / weeks about this topic:
- First article is : Blockchain explained. I am giving a clear explanation of all the technical nuts and bolts behind the blockchain technologies.
- Second is : Blockchain 2.0 - from bitcoin transactions to blockchain applications
- Third one will be : Blockchain - various business opportunities where I will discuss how the blockchain can (and will) disrupt several fields.
by Jerome Kehrli
Posted on Wednesday Oct 05, 2016 at 10:50AM in Big Data
Big Data technologies are increasingly used in retail banking institutions for customer profiling or other marketing activities. In private banking institutions, however, applications are less obvious and there are only very few initiatives.
Yet, as a matter of fact, there are opportunities in such institutions and they can be quite surprising.
Big Data technologies, initiated by Web Giants such as Google or Amazon, make it possible to analyze very massive amounts of data (ranging from Terabytes to Petabytes). Apache Hadoop is the de facto standard nowadays when it comes to Open Source Big Data technologies, but it is increasingly challenged by alternatives such as Apache Spark and others providing less constraining programming paradigms than Map-Reduce.
These Big Data processing platforms benefit from the NoSQL genes : the CAP Theorem when it comes to storing data, the usage of commodity hardware, the capacity to scale out (almost) linearly (instead of scaling up your Oracle DB) and a much lower TCO (Total Cost of Ownership) than standard architectures.
The most essential applications of such technologies in retail banking institutions consist in gathering knowledge and insights on the customer base, customers' profiles and their tendencies by using cutting-edge Machine Learning techniques on such data.
Contrary to retail banking institutions, which have been exploiting such technologies for many years, private banking institutions, with their very low number of transactions and their limited customer base, consider these technologies with a lot of skepticism and condescension.
However, contrary to preconceived ideas, use cases exist and present surprising opportunities, mostly around three topics :
- Enhance proximity with customers
- Improve investment advisory services
- Reduce computation costs
by Jerome Kehrli
Posted on Tuesday Aug 30, 2016 at 09:02AM in Computer Science
I have written a little Sudoku program for which I provide here the source code and Windows pre-built executables. Current version is Sudoku 0.2-beta.
It supports the following features:
- A GUI to display and manipulate a Sudoku board
- A Sudoku generator
- A Sudoku solver
- A Sudoku solving tutorial (quite limited at the moment)
At the moment there are two resolution methods supported, one using human-like
resolution techniques and a second using backtracking. The resolution of a solvable
Sudoku board takes a few milliseconds only.
A solvable Sudoku board is a Sudoku board that has one and only one solution.
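To make the backtracking method and the "one and only one solution" criterion more concrete, here is a minimal sketch in Java. It is not the program's actual C++ code: the class name, the method names and the board encoding (a 9x9 `int` array with 0 for an empty cell) are assumptions made for illustration only. It also assumes the pre-filled cells do not already conflict with each other.

```java
// Hypothetical sketch, not the author's implementation.
final class SudokuBacktracking {

    // Counts the solutions of a 9x9 board (0 = empty cell),
    // stopping as soon as `limit` solutions have been found.
    static int countSolutions(int[][] board, int limit) {
        for (int r = 0; r < 9; r++) {
            for (int c = 0; c < 9; c++) {
                if (board[r][c] != 0) continue;
                int found = 0;
                for (int v = 1; v <= 9 && found < limit; v++) {
                    if (canPlace(board, r, c, v)) {
                        board[r][c] = v;   // tentative move
                        found += countSolutions(board, limit - found);
                        board[r][c] = 0;   // backtrack
                    }
                }
                return found;              // first empty cell fully explored
            }
        }
        return 1;                          // no empty cell left: one complete grid
    }

    // True if `value` is absent from the row, the column and the 3x3 box.
    static boolean canPlace(int[][] board, int row, int col, int value) {
        int boxRow = row - row % 3, boxCol = col - col % 3;
        for (int i = 0; i < 9; i++) {
            if (board[row][i] == value || board[i][col] == value) return false;
            if (board[boxRow + i / 3][boxCol + i % 3] == value) return false;
        }
        return true;
    }

    // "Solvable" in the sense used above: exactly one solution.
    static boolean isSolvable(int[][] board) {
        return countSolutions(board, 2) == 1;
    }

    // A valid completed grid built from a simple shifting pattern (demo data only).
    static int[][] sampleSolvedGrid() {
        int[][] g = new int[9][9];
        for (int r = 0; r < 9; r++)
            for (int c = 0; c < 9; c++)
                g[r][c] = (3 * r + r / 3 + c) % 9 + 1;
        return g;
    }

    public static void main(String[] args) {
        int[][] board = sampleSolvedGrid();
        board[4][4] = 0;                   // remove a single clue
        System.out.println("unique solution: " + isSolvable(board));
    }
}
```

Stopping the count at two solutions is a design choice: it is enough to tell "exactly one" from "more than one" without enumerating every solution of a loosely constrained board.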
The Sudoku board generator generates solvable Sudoku boards. It usually generates boards with
between 18 and 22 pre-filled cells. (which is quite better than most generators I could
Currently it generates the best (i.e. most difficult) board it possibly can given the random initial situation (with all cells filled) of the board.
The problem I'm facing in this current version is that it can take from a few seconds only up to several minutes to generate the board (this is discussed in the algorithms section below).
In addition, the difficulty of the resulting board is not tweakable at the moment. In some cases it generates an extremely difficult board (only solvable with nishio) in a few seconds while some other times it needs two minutes to generate an easy board.
The software is written in C++ on Linux with the help of the
wxWidgets GUI library.
It is cross-platform to the extent of the wxWidgets library, i.e. it can be compiled on Windows, OS X, BSD, etc.
A makefile is provided to build it, but no configure script, i.e. you
need to adapt the makefile to your own Linux distribution (or, on Windows, to MinGW, TDM, etc.).
Fortunately, only a few adaptations should be required.
by Jerome Kehrli
Posted on Thursday Aug 07, 2014 at 04:22PM in Computer Science
Data management comprises all the disciplines related to managing data as a valuable resource.
Data Management is the development and execution of architectures, policies, practices and procedures that properly manage the full data lifecycle needs of an enterprise.
During my MS studies, I followed two interesting lectures related to Data Management.
Data Mining is the computational process of discovering patterns in large data sets involving methods at the intersection of artificial intelligence, machine learning, statistics, and database systems. In the context of Data Mining, Data Warehouses (DW) form an important aspect. Data Warehouses generalize and consolidate data in multidimensional space. The construction of DW is an important pre-processing step for data mining involving data cleaning, data integration, data transformation.
I have summarized all the notes I have taken during the Introduction to Data Mining lecture as well as some of my solutions to the exercises within the following document : Summary of the Data Mining Lecture.
Information retrieval (IR) is finding material (usually documents) of an unstructured nature (usually text) that satisfies an information need from within large collections (usually stored on computers). Information Retrieval is a field concerned with the structure, analysis, organisation, storage, searching and retrieval of information.
Here as well I have summarized the notes taken during the lecture within the following document : Summary of the Information Retrieval Lecture.
by Jerome Kehrli
Posted on Friday Dec 06, 2013 at 04:11PM in Computer Science
For some reasons that I'd rather keep private, I got interested in the kind of questions Google, Microsoft, Amazon and other tech companies ask candidates during the recruitment process. Most of these questions are oriented towards algorithmics or mathematics. Some others are logic questions or puzzles the candidate is expected to be able to solve in a dozen minutes in front of the interviewer.
I found various sites online providing lists of typical interview questions. Other sites discuss topics like "the ten toughest questions asked by Google" or by Microsoft, etc.
Then I wondered how many of them I could answer on my own without help. The truth is that while I can answer most of these questions by myself, I still needed help for almost half of them.
Anyway, I have collected my answers to a hundred of these questions below.
For the questions for which I needed some help to build an answer, I clearly indicate the source where I found it.
by Jerome Kehrli
Posted on Tuesday Mar 27, 2012 at 01:10AM in Computer Science
I want to share an interesting project that has appeared on the Web recently : the AirXCell project.
(As some of you already know, I am somewhat involved in this project :-)
AirXCell is an online R application framework currently supporting a programmable spreadsheet and an R development environment.
AirXCell is based on R - The GNU R Project for Statistical Computing. Current version is still somewhat limited yet fully functional.
Quoting the AirXCell User documentation :
AirXCell intents to revolution the world of spreadsheet applications and computational software by providing a product that:
- merges the world of spreadsheet application (e.g. Microsoft Excel, GNUmeric, etc.) and the world of computational software (e.g. Mathematica, Mathlab, etc.) and
- revolutions the usual approach in spreadsheet applications.
by Jerome Kehrli
Posted on Monday Mar 26, 2012 at 12:30AM in General
During my MSc studies, I followed an extended set of very interesting lectures related to Mathematical Optimization using basic mathematic concepts and simple algorithms such as the Newton (and/or Newton-based) methods or the simplex algorithm (and/or simplex based such as branch-and-bound, branch-and-cut, etc.).
"In the simplest case, an optimization problem consists of maximizing or minimizing a real function by systematically choosing input values from within an allowed set and computing the value of the function. The generalization of optimization theory and techniques to other formulations comprises a large area of applied mathematics.
More generally, optimization includes finding best available values of some objective function given a defined domain, including a variety of different types of objective functions and different types of domains."
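As a small illustration of the Newton method mentioned above, here is a sketch of one-dimensional minimization in Java. It is not taken from the lecture summary: the class name, the parameters and the sample function f(x) = e^x - 2x are assumptions chosen for the example. The iteration is x ← x - f'(x) / f''(x), which converges to a stationary point when started close enough to it.

```java
import java.util.function.DoubleUnaryOperator;

// Hypothetical sketch of Newton's method for 1-D minimization.
final class NewtonMinimizer {

    // d1 is the first derivative f', d2 the second derivative f''.
    // Iterates until the step is smaller than tol or maxIter is reached.
    static double minimize(DoubleUnaryOperator d1, DoubleUnaryOperator d2,
                           double x0, double tol, int maxIter) {
        double x = x0;
        for (int i = 0; i < maxIter; i++) {
            double curvature = d2.applyAsDouble(x);
            if (curvature == 0.0) break;          // Newton step undefined here
            double step = d1.applyAsDouble(x) / curvature;
            x -= step;
            if (Math.abs(step) < tol) break;      // converged
        }
        return x;
    }

    public static void main(String[] args) {
        // f(x) = e^x - 2x, so f'(x) = e^x - 2 and f''(x) = e^x;
        // the minimum is at x = ln 2.
        double x = minimize(v -> Math.exp(v) - 2, Math::exp, 0.0, 1e-12, 50);
        System.out.println(x + " (expected close to " + Math.log(2) + ")");
    }
}
```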
These days I have a (very) little more free time than usual and I've compiled a summary of these lectures from my various notes and individual chapter summaries. So I decided to put this document online, as it might help future MSc students following any lecture related to Mathematical Optimization by providing them with an introduction to the field.
The summary is available here : resume_optim.pdf.
by Jerome Kehrli
Posted on Sunday Oct 30, 2011 at 07:45PM in General
I am really amazed and astonished by a few updates I've been seeing on LinkedIn recently.
I've been working these past ten years with incredibly gifted people. You know, the kind of guys you talk with while wondering whether you yourself will ever be as good, clever and keen as them. I really think being that good is nothing to be ashamed of, so let's assume I can name these guys. The very first one I remember is Thomas Beck (Geneva, Switzerland). I worked two years under his supervision (he was the software architect on our project) and I learned more about the job discussing with him than I ever did reading any software architecture or design related book (agile, DDD, whatever). Happily I have learned a lot more since I left him, yet I'm quite sure he has learned even more, so I believe I'm still far from reaching his level of mastery of the software architecture business.
Other people I would also mention here are Sebastien Ursini, Sebastien Marc and Thomas Caprez (Geneva and Lausanne / Switzerland). I haven't seen some of these folks for several years, yet I can still pretty clearly remember what they taught me and there's not one single day where I don't benefit from these teachings in my job.
On the other hand, just like everybody, I have much more often had the occasion to work with terrible software engineers. I mainly encountered two categories.
The first one is the kind of people who went to great engineering schools or universities and assume the time they invested in their studies is quite enough and exempts them from making any additional effort to keep learning after graduating. These people are fools believing they're great only because of some piece of paper attesting they were once able to learn something. I hope all my very good French colleagues won't hate me for this, but I have to say that French engineers specifically are subject to this bad tendency.
Unfortunately, life doesn't give gifts to anyone and most of them are sooner or later taught the hard way how wrong they are, and start kicking their butts to actually start learning the job and make some progress.
The second category is way more dangerous. This is the kind of people who sell themselves as software architects without any real software development experience. These folks read lots of books, follow lots of software architecture blogs and assume that this exempts them from building their own experience before claiming to be software architects. I'm not saying reading is not good, but I am pretty sure that it is in no way comparable to experience. Unfortunately, due to poor recruitment processes on one side, and the lack of good software engineers on the market on the other side, these guys manage to find a software architect job and end up making architecture-level decisions.
I am involved in the recruitment process in my current company (just as I was in my former companies). I take care of the technical assessment. I myself am usually a nice guy (well I think) and yet I show no mercy to candidates. I am pretty well aware that a mistake I make in this process might well lead me to work with bad engineers a few months later and this is a risk I'm not willing to take at all.
I am the guy killing those people. When I see someone coming in front of me with a resume claiming several years of experience in software architecture and not able to answer correctly the very first questions I'm asking him, it usually puts me in such a bad mood that I still keep the guy for the two hours that were planned and bury him 7 feet under ground. Hopefully the guy will work on a resume a little more humble before applying to another position (in another company, needless to say).
Just a word on "answering correctly": there is usually not only one good answer to a design problem or an architectural question, nor do I expect one. But I expect the candidate at least to build a proper conceptual model of the issue I'm presenting and to be able to outline a few solutions.
Now why am I putting all this online ?
by Jerome Kehrli
Posted on Sunday Dec 26, 2010 at 10:03AM in Java
Following the initial release of the niceideas-commons package here : niceideas-commons 1.0-alpha-0.7, the niceideas-commons 1.1-beta-0.1 is released today.
Major changes are :
- Basic relation mapping support added to the DAO framework
- More helper and utilities related to resource finding and loading
- More utilities of various kinds
- Various bug fixes
by Jerome Kehrli
Posted on Saturday Nov 13, 2010 at 09:08PM in Java
I remember the introduction of the brand new enum type in Java 5 (1.5) was a very exciting announcement. However, when I finally switched from 1.4 to 1.5 and actually tried Java's flavour of enum types, I was a bit disappointed.
Before that, I had been using Josh Bloch's "Typesafe enum" pattern (Effective Java) for quite a long time and I didn't really see what was so much better with the new native Java enum construct. Ok, fine, there was the ability to use enum instances in switch - case statements, which seemed fine, but what else ?
Besides, what I used to find great with the "typesafe enum" pattern is that it could be tweaked and changed the way I wanted, for instance to be able to dynamically (at runtime) add enum instances to a specific typesafe enum class. I found it very disappointing not to be able to do the very same thing easily with the native Java enum construct.
And now you might wonder "Why the hell could one ever need to dynamically add enum values ?!?". You do, right ? Well, let's imagine this scenario:
You have a specific column in a DB table which contains various codes as values. There are more than a hundred different codes actually in use in this column. Related to this, you have some business logic which performs different operations on the rows coming from this table; the actual kind of operation applied to a row depends on the value of this code. So chances are you end up with a lot of if - elseif statements checking the actual value of the code.
I myself am allergic to using string comparison in conditions so I want to be able to map the values from this column to an enum type in Java. This way I can compare enum values instead of strings in my conditions and reduce my dependency on the format of the string value.
Now when there are more than a hundred different possible codes in the DB, I really don't have any intention of defining them all manually in my enum type. I want to define only the few I am actually using in the Java code and let the system add the other ones dynamically, at runtime, when it (the ORM system or whatever I am using for reading the DB rows) encounters a new value from the DB.
Hence my need for dynamically added enum values.
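To illustrate the idea, here is a minimal sketch of a "typesafe enum"-style class that registers unknown codes at runtime. It only shows the pattern-based variant described above, not a way to add values to a native Java enum, and it is not the solution announced below. The class name `StatusCode` and the codes `OPN` and `CLS` are made up for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical typesafe-enum-style class with dynamically added values.
final class StatusCode {

    // Registry of all known instances, one per distinct code.
    // Declared first so that it exists before the constants below are created.
    private static final Map<String, StatusCode> VALUES = new ConcurrentHashMap<>();

    // The few codes the Java code actually relies on are declared statically.
    static final StatusCode OPEN = valueOf("OPN");
    static final StatusCode CLOSED = valueOf("CLS");

    private final String code;

    private StatusCode(String code) {
        this.code = code;
    }

    // Returns the unique instance for a code, creating and registering it
    // the first time the code is encountered (e.g. while reading DB rows).
    static StatusCode valueOf(String code) {
        return VALUES.computeIfAbsent(code, StatusCode::new);
    }

    String code() {
        return code;
    }

    @Override
    public String toString() {
        return "StatusCode(" + code + ")";
    }

    public static void main(String[] args) {
        StatusCode fromDb = StatusCode.valueOf("OPN");   // known code
        StatusCode unknown = StatusCode.valueOf("XYZ");  // added at runtime
        // Instances are unique per code, so identity comparison replaces
        // string comparison in the business logic.
        System.out.println(fromDb == StatusCode.OPEN);
        System.out.println(unknown);
    }
}
```

Because the constructor is private and every instance goes through the registry, two references to the same code are always the same object, which is what makes `==` comparisons safe here.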
So recently I faced this need once again and took a few hours to build a little solution which enables one to dynamically add values to a Java enum type. The solution is the following :
by Jerome Kehrli
Posted on Wednesday Nov 03, 2010 at 08:40AM in Java
I've been facing an interesting problem with string manipulation in Java lately at work. The requirement was the following :
We have a field on some screen where the user can type in a comment. The comment can have any length the user wants, absolutely any. Should he want to type in a comment of a million characters, he should be able to do so.
Now the right way to store this comment in a database is using a CLOB, a BLOB or a LONGVARCHAR or whatever feature the database natively provides to do so. Unfortunately that's not the way it was designed. Due to legacy integration needs, all these advanced DB types are prohibited within our application. So the way we have to store the comment consists of using several rows, each with a single comment field of a maximum length of 500 characters. That means the long comment has to be split into several sub-strings of 500 characters, each of them stored in a separate row in the DB table. The table has a counter as part of the primary key which is incremented for each new row belonging to the same comment. This way we can easily spot every row that is part of the same comment.
Now another problem we have is that under DB2 a field defined as VARCHAR(500) can contain 500 bytes at most, not 500 characters, and the strings are encoded in UTF-8 in the database. That means we might not be able to store 500 characters if the string contains one or more 2-byte UTF-8 characters. Working in a French environment, this happens a lot.
So we had to write a little algorithm taking care of splitting the string into sub-strings of at most 500 bytes.
The very first version of our algorithm was quite stupid and split the string in a rather naive way: we converted the string to a byte array using the UTF-8 encoding and split the byte array instead of the string. Then each of the 500-byte arrays was converted back to a string before being inserted in the database.
Fortunately, we figured out quite soon that this doesn't work, as it quite often ends up splitting the string right in the middle of a 2-byte character. When the byte arrays were converted back to strings, the split 2-byte character was corrupted and could not be repaired any more.
Before writing a smarter version of the algorithm which would manually test the byte length of the character right at the position of the split, we took a step back and wondered : "Can it be that Java doesn't offer natively a simple way to do just that ?"
And the answer is yes of course.
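For reference, the manual approach mentioned above (testing the byte length of each character before cutting) could be sketched as follows. This is only an illustration of that manual variant, not the native Java facility the answer refers to. The class name `Utf8Splitter` and its methods are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: splits a string into chunks whose UTF-8 encoding
// never exceeds maxBytes, cutting only between whole characters.
final class Utf8Splitter {

    static List<String> split(String text, int maxBytes) {
        if (maxBytes < 4) {
            // A single code point can need up to 4 bytes in UTF-8.
            throw new IllegalArgumentException("maxBytes must be at least 4");
        }
        List<String> chunks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int currentBytes = 0;
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            int length = utf8Length(codePoint);
            if (currentBytes + length > maxBytes) {
                // The next character would not fit: close the current chunk.
                chunks.add(current.toString());
                current.setLength(0);
                currentBytes = 0;
            }
            current.appendCodePoint(codePoint);
            currentBytes += length;
            i += Character.charCount(codePoint);
        }
        if (current.length() > 0) {
            chunks.add(current.toString());
        }
        return chunks;
    }

    // Number of bytes UTF-8 uses for a given Unicode code point.
    static int utf8Length(int codePoint) {
        if (codePoint < 0x80) return 1;
        if (codePoint < 0x800) return 2;
        if (codePoint < 0x10000) return 3;
        return 4;
    }

    public static void main(String[] args) {
        StringBuilder comment = new StringBuilder("a");
        for (int i = 0; i < 300; i++) comment.append('\u00e9'); // 2-byte character
        for (String chunk : split(comment.toString(), 500)) {
            System.out.println(chunk.length() + " chars, "
                    + chunk.getBytes(StandardCharsets.UTF_8).length + " bytes");
        }
    }
}
```

Working on code points rather than on the encoded byte array is what avoids the corruption described above: a chunk boundary can only fall between two complete characters.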
by Jerome Kehrli
Posted on Sunday Oct 24, 2010 at 10:29PM in Java
CommunityBoard is a sample multi-module maven / glassfish / eclipse Java EE project.
It implements a little Forum / Note publishing application. Its main purpose is to act as an introductory laboratory for Java EE programming. As such the functionalities are rather limited. Yet it covers the most fundamental aspects or issues of Java EE programming, in that it shows how to :
- write entity beans with bi-directional relationship;
- use these Entity beans in EJBs (Stateless session beans);
- use other EJBs in EJBs;
- use EJBs in a servlet or a JSP located in a WAR (i.e. no processing of the
- build a multi-module Java EE maven project with jars, wars, ears;
- write JSPs with the JSTL (Ok I am not very proud of these JSPs yet they do the job) and
- deploy a multi-module ear within Glassfish and use a container defined datasource
by Jerome Kehrli
Posted on Thursday May 06, 2010 at 11:03PM in Computer Science
A few years ago I worked on an architectural concept for a very specific piece of software my former company had to develop. The technical challenges were huge and the field was pretty complex. In addition, the timeframe was very short and we had to rush a lot to get it ready and prototyped in time.
In the end we screwed up ... totally. The concept was miles away from what was required and we pretty much had to start it all over. Months of work were just good enough to be thrown away with the trash.
Not used at all to such failures, I decided to take some time to understand what happened, what went wrong.
My investigations led to the following story, a pretty funny though quite common developer tale.