High-Tech Scale-ups: Finding software bugs that often slip through.
March 5, 2021, by Administrator


Martijn Rutten is CEO and co-founder of the disruptive software company Vector Fabrics, based in Zaltbommel in the Dutch province of Gelderland. The company's Wikipedia page explains that it has built a set of software analysis tools that are increasingly in demand from the automotive industry, which is trying to build the self-driving car. He spoke with StartupDelta editor Jonathan Marks, who asked him why the world needs their technology.

“Most people realise that cars, planes and many other bits of hardware are totally dependent on the software which drives them,” explains Martijn. “But only those in the software programming business realise just how complicated this has all become. In fact, the complexity of software in embedded systems, automotive and mobile devices is growing exponentially. Software stacks of 10 million to 100 million lines of code are not uncommon.”

Growth in lines of code

“The problem is that no software is ever perfect.”

“Statistically, roughly 1 in every 1,000 lines of software has some kind of coding mistake in it, better known as a ‘bug’. Even if extreme quality assurance got this figure down to 1 bug in every 10,000 lines, the Joint Strike Fighter aircraft (F-35), for example, would still be released with over 2,400 bugs in its on-board software. That’s rather worrying if one of those bugs ends up causing a catastrophe.”
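
The arithmetic behind these figures can be written down as a small C++ sketch. The function and constant names are illustrative, and the 24 million line code size is an assumption: it is the size implied by 2,400 bugs at 1 bug per 10,000 lines, not a number given in the interview. The 300 million line figure is the largest project size mentioned later in the interview.

```cpp
#include <cstdint>

// Expected number of latent defects in a code base of `lines` lines,
// assuming one defect per `lines_per_bug` lines on average.
constexpr std::int64_t expected_bugs(std::int64_t lines, std::int64_t lines_per_bug) {
    return lines / lines_per_bug;
}

// Assumption: 24 million lines is the code size implied by the quoted figures
// (2,400 bugs at 1 bug per 10,000 lines); the interview does not state it.
constexpr std::int64_t kImpliedLines = 24000000;

// 300 million lines is the largest project size mentioned in the interview.
constexpr std::int64_t kLargestProjectLines = 300000000;

// The checks below are evaluated by the compiler.
static_assert(expected_bugs(kImpliedLines, 10000) == 2400, "extreme QA density");
static_assert(expected_bugs(kImpliedLines, 1000) == 24000, "typical density");
static_assert(expected_bugs(kLargestProjectLines, 1000) == 300000, "300M-line stack");
```

At the typical density of 1 bug per 1,000 lines, the same code base would carry about 24,000 bugs, and a 300 million line stack about 300,000.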

“The lines of software code in embedded systems like vehicles are growing rapidly as the industry moves in the direction of the self-driving car. We’ve already seen projects with 300 million lines of code. Recently we saw United Airlines, The Washington Post and the New York Stock Exchange dealing with ‘technical issues’ on the same day. Air traffic control systems in Europe have also been affected. The cause is usually a software problem, not a hardware one.”

Traditional verification tools need a rethink

“Today’s software is inherently dynamic: reacting to outside events, processing client data, and often dynamically allocating computer memory. Bugs relating to this dynamic behaviour are notoriously hard to detect. Once a bug triggers a problem, finding the root cause often takes weeks of developer time. The costs? Toyota spent US $2 billion in 2009/2010 recalling cars that exhibited unintended acceleration due to a ‘software glitch’. And last year Toyota announced a further 6 million recalls. Similar recall problems have plagued some US car manufacturers too.”

“What makes things worse is the trend for hardware manufacturers to switch to multi-core processing. They are forced by the laws of physics to use 4, 8, or 16 cores in parallel if they want to build more computing power. But what they traditionally do is create the hardware and then push the challenge of using it to the software programmers.”

“Multi-threading and parallel processing on new dual- and quad-core platforms introduce more dynamic behaviour into the software. This leads to timing-related defects such as ‘data races’ and deadlocks. As it is not feasible to verify all possible timing variations, many of these defects will only be found by users in the field. But imagine the legal ramifications if a manufacturer sells a car with a glitch in its brake software that causes a fatal accident?”
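
As an illustration of the deadlock half of that sentence, here is a minimal C++ sketch. The `Resource` type and the function names are hypothetical and do not come from the interview. Two threads each need the same pair of locks but ask for them in opposite order. If each thread locked the two mutexes one after the other, the outcome would depend on timing: most runs would finish, and an occasional run would hang forever. The sketch shows one common remedy, `std::lock`, which acquires both mutexes without deadlocking.

```cpp
#include <mutex>
#include <thread>

// A shared resource guarded by its own mutex (hypothetical example type).
struct Resource {
    std::mutex m;
    long units = 0;
};

// Moves `amount` units from one resource to the other.
// Locking from.m and then to.m one after the other could deadlock when another
// thread moves units in the opposite direction at the same moment.
// std::lock acquires both mutexes using a deadlock-avoidance algorithm,
// so the order in which callers name the resources no longer matters.
void move_units(Resource& from, Resource& to, long amount) {
    std::lock(from.m, to.m);
    std::lock_guard<std::mutex> hold_from(from.m, std::adopt_lock);
    std::lock_guard<std::mutex> hold_to(to.m, std::adopt_lock);
    from.units -= amount;
    to.units += amount;
}

// Runs two threads that move units in opposite directions and
// returns the combined total once both have finished.
long run_opposing_moves() {
    Resource a;
    Resource b;
    a.units = 1000;
    b.units = 1000;
    std::thread t1([&] { for (int i = 0; i < 10000; ++i) move_units(a, b, 1); });
    std::thread t2([&] { for (int i = 0; i < 10000; ++i) move_units(b, a, 1); });
    t1.join();
    t2.join();
    return a.units + b.units;
}
```

`run_opposing_moves()` returns 2000 on every run, because each update happens while both locks are held. The point of the sketch is that the naive version would pass most tests and fail only under particular timing, which is why such defects tend to surface in the field.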

Automotive Needs Multicore

“The automotive manufacturers are realising that in order to process the huge amount of data associated with a self-driving car, they are also forced to switch to dual-core and quad-core processors. They have developed advanced driver assistance system (ADAS) features such as lane detection, automatic lighting, adaptive cruise control, pedestrian detection, automatic braking, GPS and traffic warnings, and blind-spot display. And our tools are showing that this trend triggers all kinds of bugs that were not there before.”

“We're getting a lot of requests for assistance from both the Japanese automotive industry and military clients. We also field enquiries from those building, for example, MRI scanners or military radar applications, where there is usually a lot of processing going on. All this code is written in C and C++.”

Can you briefly explain what’s going wrong?

OK. Let's take an oversimplified example of dynamic behaviour. You have a dual-core processor: one core writes a value, and the other reads this shared variable so it can perform some kind of action. What happens in the real world is that the timing may be slightly off. The second core tries to read the data before the first core has written anything, so it reads rubbish instead of the true value. If that garbage data were somehow related to the acceleration of the car, you would have a potentially dangerous situation. That's what we call a “data race”.
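
The two-core example can be sketched in C++ as follows. This is a minimal illustration, not code from Vector Fabrics; the names `speed_setpoint`, `ready` and `run_handoff` are hypothetical. The racy version would have the reader simply read `speed_setpoint` with no coordination at all. The sketch shows one standard fix: the writer publishes the value through an atomic flag, and the reader waits for that flag before touching the data.

```cpp
#include <atomic>
#include <thread>

// Hands one value from a writer thread to a reader thread and
// returns what the reader observed.
int run_handoff() {
    int speed_setpoint = 0;          // plain shared variable (hypothetical name)
    std::atomic<bool> ready{false};  // set once the data has been written
    int observed = -1;

    std::thread writer([&] {
        speed_setpoint = 42;                           // 1. write the data
        ready.store(true, std::memory_order_release);  // 2. publish it
    });

    std::thread reader([&] {
        // Without this wait, the read below could happen before the write
        // above: that unsynchronised access is the data race.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        observed = speed_setpoint;  // the acquire load makes the write visible
    });

    writer.join();
    reader.join();
    return observed;
}
```

`run_handoff()` returns 42. The release store paired with the acquire load is what guarantees the reader sees the written value; remove the flag and the result depends on which thread happens to run first.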

Of course, there are existing tools and methods to find bugs. Most tools are based on what’s called “static analysis”. That means looking at the source code, parsing it, then checking it for violations of your coding standard. In automotive, for instance, they have what's called the MISRA coding standard. Static analysis tools check the software against all kinds of coding rules to evaluate whether your code is safe. While these tools may flag many potential problems, errors in the dynamic behaviour are often missed.

Vector Fabrics Verification

The verification product suite Pareon Verify is now being rolled out by Vector Fabrics.

“With Pareon Verify, we took a different approach,” says Martijn. “Rather than looking at static structure, we look at how the code in the program behaves when it is running on the processor. Dynamic analysis (also known as runtime analysis) focuses on the behaviour of the software. By actually executing the code, you get a global view of data and control flow. Avoiding the guesswork on how the software executes also means you don’t get false positives: reported bugs that are in fact not bugs. This is the missing link: detecting the costly bugs that existing tools and methods fail to catch.
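
To make the idea of checking behaviour at run time concrete, here is a very small C++ sketch. It is a generic illustration only: the interview does not describe how Pareon Verify works internally, and the function names and table contents are hypothetical. Whether the access is valid depends entirely on an index that arrives while the program runs, so the check is made at the moment of the access, on the value actually used.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Looks up a table entry by an index that is only known at run time.
// std::vector::at checks the index on every access and throws
// std::out_of_range when it is invalid.
int lookup(const std::vector<int>& table, std::size_t index) {
    return table.at(index);
}

// Returns true if an access with this index is caught as out of bounds.
bool access_is_flagged(std::size_t index) {
    const std::vector<int> table{10, 20, 30, 40};  // illustrative data
    try {
        (void)lookup(table, index);
        return false;  // the access was valid
    } catch (const std::out_of_range&) {
        return true;   // the bad access was observed as it happened
    }
}
```

`access_is_flagged(3)` returns false and `access_is_flagged(4)` returns true. Because the report is tied to an access that really took place, it cannot be a false alarm, which is the property the quote above attributes to dynamic analysis.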

“We find all kinds of bugs in code that has already been through extensive static analysis, unit testing and integration testing. These are the real ‘Heisenbugs’, named after the Heisenberg Uncertainty Principle. Once you look at them they seem to disappear. Developers will recognise this situation. Suppose you have a data race. The moment you look at it in a debugger, the timing changes and the bug disappears. The same thing happens with memory allocation problems: the bugs simply slip through. When customers complain their systems are crashing, it can take about 3 weeks just to reproduce the bug, and it may take even longer to find the root cause.”

How can Vector Fabrics scale up bearing in mind that the code is growing exponentially?

“We designed our tool suite from the start to be very easy to use and with a straightforward licensing model. This allows us to reach many customers over the internet, without having to set up a huge sales force.”

“A second aspect is that we took extreme care to be able to analyse every aspect of the C and C++ languages. This means customers can run our tools without any issues on their legacy code, on code developed by partners, or on code coming right off the Internet.”

Why found a technology startup in the Netherlands?

“Because this is where the high-tech software and hardware industries come together. There is a concentration of embedded software companies here that are facing the challenges we are solving.

“We have our roots in Eindhoven, which forms a strong cluster of embedded systems companies. We’re very active in the Brainport region with existing clients and a new initiative to build a high-tech software cluster. But we also have easy access to the really great computer scientists at the University of Utrecht and also in Amsterdam, which is the software capital of the Netherlands.”

Is building a startup the best way to disrupt an industry?

“We think so. We started as three Philips research guys on the High Tech Campus Eindhoven. We later worked at NXP Semiconductors. Yet each of us quit our jobs for the same reason: a belief that there was a real challenge approaching for programmers of multicore processors, and we had the vision to develop real tools to solve the problem.”

“We had to do a lot of disruptive thinking, closing our ears to those who said it was impossible. It is much easier to do that kind of work in a small startup, far more difficult in a large corporate where the decision loops are so much longer. So the path for us was obvious. And I think a startup incubator model is a good idea for large corporates. It allows you to identify and validate some real problems in the market and to make some dramatic technology choices.”

It’s true – A great team is key to sustained success

“If you have the right team, you can move at a lightning pace. And we're lucky that we've found the right people. Many of the candidates who approach us for a job are fascinated by the problems that our tools solve, such as parallelization of software code. But that doesn't mean they have the expertise in algorithms or computer science needed to create tools that tackle those problems. So we tend to look for the real computer scientists, who have a very different background. And we have attracted talent from all over the globe to our part of the Netherlands.”

But what needs to happen next?

“We need to find more diehard computer scientists, people with the capability to make something very disruptive. I think we could do more to ensure we don’t lose that talent pool. In many European universities we see that some of the world’s best computer scientists originate from the most abstract research areas: highly academic work. Increasingly, however, universities tend not to value these departments because they don't result in an immediate technology transfer to industry. So typically, they close the departments for the wrong reasons. You can see that in physics, for example, where astronomy departments are being closed. Yet that's where you find the top-notch programmers trying to crack very complex theoretical problems. That means that companies like ours, who are pushing the boundaries of technology, have no choice but to look abroad for that talent.”

“On the Dutch government side, it is still challenging to get across our disruptive technology and the fact that we have to operate as a global enterprise from day one. Our launching customers are typically global companies, often based in Asia. Regional funds and incentives, however, often look to support local projects with a strong link to local customers such as Philips or ASML. While most funds have a time-span of 3 to 5 years, disruptive technology typically takes 10 years to prove itself – and another 10 years to become mainstream. But then you have something world-class. And that leaves a mark on a region like this. So I hope we will see more of a balance of mid- and long-term investments.”
