Sometimes major shifts happen almost unnoticed. On May 5th, IBM published Project CodeNet, and it received very little media or academic attention.
CodeNet is an attempt to do for coding what ImageNet did for computer vision and artificial intelligence (AI): it is a dataset of over 14 million code samples, covering 50 programming languages, aimed at solving 4,000 coding problems. The dataset also contains additional data, such as the amount of memory a program needs to run and the logs of its output.
Accelerating machine learning
IBM’s own stated motivation for CodeNet is to enable the rapid updating of legacy systems programmed in outdated code, a long-awaited development since the Y2K scare more than 20 years ago, when many feared that undocumented legacy systems could fail with devastating consequences.
However, as security researchers, we believe the most important implication of CodeNet and similar projects is their potential to lower the barriers to coding through natural language coding (NLC).
In recent years, companies such as OpenAI and Google have rapidly improved natural language processing (NLP) technologies: machine learning programs designed to better understand and mimic natural human language, and to translate between different languages. Training such machine learning systems requires large datasets of text written in the desired human languages. NLC applies all of this to coding.
Coding is a difficult skill to master, and an experienced coder is expected to be proficient in multiple programming languages. NLC, by contrast, uses NLP technologies and large databases such as CodeNet to let people code in English or, eventually, in French, Chinese or any other natural language. It could make designing a simple website as easy as writing: “Create a red background with an image of an airplane, my contact information in the middle and my company logo below,” with the site itself resulting from an automatic translation of natural language into code.
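To make the idea concrete, here is a purely hypothetical sketch of what such a system might produce from that description. No real NLC tool is shown; the file names, email address and markup are invented for illustration:

```python
# Hypothetical sketch only: this is the kind of HTML a natural-language-
# coding system MIGHT generate from the plain-English request quoted
# above. It is not the output of any real system.
description = ("Create a red background with an image of an airplane, "
               "my contact information in the middle and my company logo below")

generated_html = """\
<html>
  <body style="background-color: red; text-align: center;">
    <img src="airplane.jpg" alt="An airplane">
    <p>Contact: jane@example.com</p>
    <img src="logo.png" alt="Company logo">
  </body>
</html>
"""

print(generated_html)
```

The point is not the specific markup but the mapping: each clause of the request (“red background,” “image of an airplane,” “contact information,” “logo below”) corresponds to one element of the generated page.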
It is clear that IBM is not alone in this thinking. GPT-3, OpenAI’s flagship NLP model, has already been used to code a website or an app simply from a written description of what is wanted. Microsoft, for its part, has announced that it secured exclusive rights to GPT-3.
Microsoft also owns GitHub, the largest collection of open-source code on the internet, acquired in 2018. GitHub has introduced Copilot, an AI assistant. Once a programmer has typed a description of what they want to code, Copilot generates a code sample that could achieve what they specified. The programmer can then accept, edit or reject the AI-generated sample, greatly simplifying the coding process. Copilot is a big step toward NLC, but it is not there yet.
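The comment-to-code workflow described above can be illustrated with a hypothetical example. The function name and body below are invented for this sketch; this is not actual Copilot output:

```python
# Hypothetical illustration of assistant-style code generation:
# the programmer writes only the comment and the signature, and an
# assistant like Copilot might then propose the body shown below.

def average_word_length(sentence: str) -> float:
    """Return the mean length of the words in a sentence."""
    words = sentence.split()
    if not words:  # avoid dividing by zero on empty input
        return 0.0
    return sum(len(w) for w in words) / len(words)

print(average_word_length("natural language coding lowers barriers"))  # → 7.0
```

The programmer reviews the proposed body and accepts, edits or rejects it, which is what distinguishes an assistant like Copilot from full natural language coding.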
Consequences of natural language coding
While NLC is not yet fully realized, we are moving fast toward a future where coding is far more accessible to the average person. The implications are huge.
First, there are consequences for research and development. It has been argued that the greater the number of potential innovators, the faster the rate of innovation. By removing barriers to coding, NLC expands the pool of potential innovators.
Also, academic disciplines as different as computational physics and statistical sociology increasingly rely on custom computer programs to process data. Reducing the skill needed to create such programs would increase the ability of researchers in specialized fields outside computer science to apply these methods and make new discoveries.
However, there are dangers as well. Ironically, one is the de-democratization of coding. Currently, numerous coding platforms exist. Some of these platforms offer features that different programmers prefer, but none confers a decisive competitive advantage: a new programmer could easily use a “bare-bones” coding terminal at only a minor disadvantage.
However, developing or deploying the level of AI required for NLC is not cheap, and is likely to be monopolized by major platform corporations such as Microsoft, Google or IBM. The service might be offered for free or cheaply but, like most social media services, under unfavourable or exploitative terms of use.
There is also reason to believe such technologies will come to be dominated by platform corporations because of the way machine learning works. Theoretically, programs like Copilot improve as they are fed new data: the more they are used, the better they become. This makes it difficult for new competitors to break in, even with a stronger or more ethical product.
It seems that, barring serious pushback, big centralized platform corporations could become the gatekeepers of the next coding revolution.