Computer vision is approaching a crucial milestone

Computer vision, a well-known and highly useful branch of AI, is approaching a quiet revolution thanks to advances in automated labeling systems.

For humans, object recognition is child's play. It is a skill we have been actively developing since early childhood, and the human brain is remarkably good at performing such tasks with formidable precision.

It is so gifted, in fact, that machines regularly ask for our opinion on the subject; this is what happens every time we complete certain types of CAPTCHAs. It's not uncommon for Google and its peers to ask you which box contains a sailboat, a crosswalk, or one of those annoying traffic lights.

Internet users' responses are then used as reference data to improve the reliability of AI-based autonomous systems. By multiplying samples in this way, the big names in AI hope their creations will become ever more reliable.

But today it seems obvious that this approach is not enough on its own; if asking the general public whether an image shows a bus or a truck were enough to train a so-called "strong" general-purpose AI, our civilization would have changed completely a long time ago.

Data tagging is a real headache that consumes enormous amounts of time for researchers working on image-based AI. © Tim Gouw – Unsplash

Labeling, the ordeal of AI researchers

When it comes to developing systems that are both ultra-accurate and reliable, very few approaches are viable, and relying on exasperated Internet users is definitely not one of them. Instead, it takes time to ensure the reliability of the information the neural network will ingest.

If the labels are only approximate, the AI will be trained on erroneous data and its results will not be meaningful. In short: "garbage in, garbage out," as specialists in the discipline say.

This requires researchers to go through images one by one and "draw" colored masks on them, each delimiting an element the AI is supposed to identify. This is called data tagging.
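Concretely, such a mask is just a per-pixel map of class IDs. Here is a minimal sketch of the idea (the image size, classes, and regions below are illustrative assumptions, not from any specific annotation tool):

```python
import numpy as np

# A hypothetical 4x6 image labeled by hand: each pixel gets a class ID.
# 0 = background, 1 = "car", 2 = "road" (class names are illustrative).
mask = np.zeros((4, 6), dtype=np.uint8)
mask[1:3, 1:4] = 1   # the annotator "draws" a car region
mask[3, :] = 2       # and a road region along the bottom row

# A segmentation model is trained to reproduce this map pixel by pixel,
# which is why every pixel of every image must be labeled.
pixels_labeled = int((mask > 0).sum())
print(pixels_labeled)  # 12 annotated foreground pixels
```

Multiply this by every object in every image of a dataset, and the scale of the manual work becomes clear.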

It is a very long process, regularly counted in hundreds of hours. The databases in question gather thousands, even hundreds of thousands, of images. And this is not work to be rushed; the quality of the final product depends directly on the quality of the available data.
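A rough back-of-envelope calculation shows how quickly this adds up (the per-image time and dataset size below are illustrative assumptions, not figures from the article):

```python
# Illustrative assumptions: 10,000 images, 5 minutes of manual masking each.
images = 10_000
minutes_per_image = 5

total_hours = images * minutes_per_image / 60
print(round(total_hours))  # ~833 hours of annotation work
```

Even at a few minutes per image, a modest dataset already costs hundreds of hours of expert time, which matches the order of magnitude described above.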

This approach therefore seems paradoxical, if not downright archaic, in a field as cutting-edge as AI research. The problem is all the more significant because this time could be devoted to more fundamental work that matters far more for the development of the technology.

Researchers are therefore trying to develop systems capable of performing this thankless task for them. So far, the results have been mixed in quality. Moreover, this approach involves working pixel by pixel; you don't have to be a great computer scientist to understand that this quickly becomes a computing-power problem, since it involves hundreds of thousands of images that must be processed consistently from start to finish.

An algorithm to do the groundwork

The human brain remains the great specialist in this discipline. But recent work from MIT researchers, spotted by Engadget, may significantly narrow the gap. With the help of Cornell University and Microsoft, MIT developed an algorithm called STEGO. Its goal is to tag images autonomously, in record time, with pixel-level accuracy.

"The idea is that these algorithms can define coherent sets largely automatically so that we don't have to do it ourselves," explains Mark Hamilton, lead author of the study.

To achieve this, the algorithm scans the entire dataset for recurring objects that appear across multiple images. "It then associates them to build a consistent end result across all the images it learns from," the team explains in a press release.
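The press release does not give implementation details, but the general principle of grouping recurring visual features into coherent sets can be sketched with a plain k-means clustering over per-pixel feature vectors. This is a deliberate simplification, not STEGO's actual method (STEGO distills features from a pretrained self-supervised backbone); the toy features below are synthetic:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for per-pixel features pooled from many images: two
# recurring "visual concepts" (e.g. sky-like and grass-like) in 2-D.
sky = rng.normal(loc=[0.0, 5.0], scale=0.3, size=(200, 2))
grass = rng.normal(loc=[5.0, 0.0], scale=0.3, size=(200, 2))
features = np.vstack([sky, grass])

def kmeans(x, iters=20):
    """Minimal 2-cluster Lloyd's algorithm; groups recurring features
    into coherent clusters that serve as automatic pseudo-labels."""
    # Initialize from two widely spaced points (one per toy concept).
    centers = np.stack([x[0], x[-1]])
    for _ in range(iters):
        dists = np.linalg.norm(x[:, None] - centers[None], axis=2)
        labels = dists.argmin(axis=1)
        centers = np.stack([x[labels == i].mean(axis=0) for i in range(2)])
    return labels

labels = kmeans(features)
# All "sky" features land in one cluster, all "grass" in the other,
# without a human ever naming either group.
print(len(set(labels[:200].tolist())), len(set(labels[200:].tolist())))  # 1 1
```

The clusters themselves carry no names; a human (or a downstream model) only has to label each group once, rather than every pixel of every image.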

The researchers then compared STEGO's results with those of other standalone labeling systems, and the outcome was quite striking. They explain that STEGO proved at least twice as effective as its counterparts. It is the first time such an algorithm has aligned almost perfectly with human-tagged control images.

This is major progress; it could allow many researchers to drastically increase the speed at which large datasets can be labeled. But it would also be overly simplistic to reduce the impact of autonomous systems like STEGO to a simple matter of productivity.

The tagged images (bottom) were generated autonomously by STEGO. © Hamilton et al.

Transcending human limits once and for all

The main advantage of this method is its ability to identify complex patterns that humans cannot accurately label. "Whether you're looking at oncological scans, images of a planet's surface, or high-resolution microbiological images, it's hard to know where to look without being a true expert," the researchers explain.

"In some areas, even human experts do not know what the objects in question should look like," adds Hamilton. "In these situations where we operate at the frontiers of science, we cannot rely on humans to understand before the machine does," he specifies.

Such a self-supervised system could thus work wonders in certain areas; just think of cancer diagnosis, or environment recognition in autonomous vehicles. And that is only the tip of a huge iceberg of possible applications.

There is still work to be done to reach that stage, as STEGO still suffers from some limitations. For example, it can be thrown completely off course by an eccentric image, such as a banana placed in the cradle of a landline phone. The good old "garbage in, garbage out" therefore still applies. But it is probably just a matter of time before STEGO and its successors become mature enough to spark a real revolution in this important niche of artificial intelligence.
