Radar Trends to Watch: August 2022

The large model train keeps rolling on. This month, we’ve seen the release of Bloom, an open, large language model developed by the BigScience collaboration, the first public access to DALL-E (along with a guide to prompt engineering), a Copilot-like model for generating regular expressions from English-language prompts, and Simon Willison’s experiments using GPT-3 to explain JavaScript code.

On other fronts, NIST has released the first proposed standard for post-quantum cryptography (i.e., cryptography that can’t be broken by quantum computers). CRISPR has been used in human trials to re-engineer a patient’s DNA to reduce cholesterol. And a surprising number of cities are paying high tech remote workers to move there.

Artificial Intelligence

  • Regardless of where a company is based, to avoid legal problems later, it’s a good idea to build AI and other data-based systems that observe the EU’s data laws.
  • Public (beta) access to DALL-E is beginning! It might take a while to get in because there are over a million on the waitlist. Accepted users get 50 free credits the first month, 15/month thereafter; a credit allows you to give one prompt, which returns 4 images. Users can buy additional credits.
  • Researchers have used reinforcement learning to build a robotic dog that learns to walk on its own in the real world (i.e., without prior training and use of a simulator).
  • Princeton held a workshop on the reproducibility crisis that the use of machine learning is causing in science. Evaluating the accuracy of results from machine learning is a problem that most scientific disciplines aren’t yet equipped to deal with.
  • Microsoft has revised its Responsible AI standard, making recommendations more concrete, particularly in the areas of accountability, transparency, fairness, safety, privacy, and inclusiveness. Microsoft also provides tools and resources to help developers build responsible AI systems.
  • The Dallery Gallery has published a Prompt Engineering Guide to DALL-E. (DALL-E is maintaining a waitlist for free trial accounts.)
  • Simon Willison has successfully used GPT-3 to explain how code works. It is amazingly good and, as Simon pointed out, works both on code that he understands, and code that he doesn’t.
  • Bloom, the open and transparent large language model developed by the BigScience group, is finished!  You can try it out, download it, and read its specifications. Unlike all other large language models, Bloom was developed in public, and is open to the public.
  • Radiologists outperform AI systems operating by themselves at detecting breast cancer from mammograms. However, a system designed to collaborate with radiologists in making decisions is better than either radiologists or AI alone. (The big question is whether these results hold up when taken to other hospitals.)
  • You liked Copilot? Try Autoregex, which uses GPT-3 to generate regular expressions from natural language descriptions.
  • No Language Left Behind (NLLB) is a Meta AI project that translates text directly between any pair of over 200 languages. Benchmarks, training code, and models are all open source.
  • Democratic AI is an experiment in human-in-the-loop design that enables an AI system to design a social mechanism with human collaboration.
  • The Allen Institute, Microsoft, and others have developed a tool to measure the energy use and emissions generated by training AI models on Azure. They have found that emissions can be reduced substantially by training during periods when renewable power is at its peak.
  • Minerva is a large language model that Google has trained to solve quantitative reasoning (i.e., mathematics) problems, generating simple proofs in addition to answers. The problem domain extends through pre-calculus, including algebra and geometry, roughly at a high school level. Minerva has also been trained and tested in chemistry and physics.

Security

  • Perhaps the scariest exploit in security would be a rootkit that cannot be detected or removed, even by wiping the disk and reinstalling the operating system. Such rootkits were recently discovered (one is named CosmicStrand); they have apparently been in the wild since 2016.
  • AWS is offering some customers a free multi-factor authentication (MFA) security key.
  • Lost passwords are an important attack vector for industrial systems. A system is installed; the default password is changed; the person who changed the password leaves; the password is lost; the company installs password recovery software, which is often malware-infested, to recover the password.
  • A new technique for browser de-anonymization is based on correlating users’ activities on different websites.
  • Ransomware companies are now using search engines to allow their users to search the data they have stolen.
  • Ransomware doesn’t get as much attention in the news as it did last year, but in the past week one ransomware operation has shut down and released its decryptors, and two new ones (RedAlert and omega) have started.
  • Apple has added “lockdown mode” to iOS.  Lockdown mode provides an extreme degree of privacy; it is intended for people who believe they are being targeted by state-sponsored mercenary spyware.
  • The Open Source Security Mobilization Plan is an initiative that aims to address major areas of open source security, including education, risk assessment, digital signatures, memory safety, incident response, and software supply chain management.
  • Mitre has released their annual list of the 25 most dangerous software weaknesses (bugs, flaws, vulnerabilities).
  • Patches for the Log4J vulnerability were released back in December 2021, but many organizations have not applied them, and remain vulnerable to attack.

Programming

  • Microsoft and Oracle have announced Oracle Data Service, which allows applications running on Azure to manage and use data in Oracle’s cloud. It’s a multicloud strategy that’s enabled by the cloud providers.
  • Google has announced a new programming language, Carbon, that is intended to be the successor to C++. One goal is complete interoperability between Carbon and existing C++ code and libraries.
  • How to save money on AWS Lambda: watch your memory!  Don’t over-allocate memory. This probably only applies to a few of your functions, but those functions are what drive the cost up.
  • SocialCyber is a DARPA program to understand the internals of open source software, along with the communities that create the software. They plan to use machine learning heavily, both to understand the code and to map and analyze communications within the communities. They are concerned about potential vulnerabilities in the software that the US military depends on.
  • WebAssembly in the cloud? Maybe it isn’t just a client-side technology. As language support grows, so do the kinds of applications Wasm can support.
  • A survey reports that 62% of its respondents were only “somewhat confident” that open source software was “secure, up-to-date, and well-maintained.”  Disappointing as this may be, it’s actually an improvement over prior results.
  • Is low-code infrastructure as code the future of cloud operations?
  • Tiny Core Linux is amazingly small: a 22MB download, and runs in 48MB of RAM. As a consequence, it’s also amazingly fast. With a few exceptions, making things small has not been a trend over the past few years. We hope to see more of this.
  • Yet another JavaScript web framework? Fresh does server-side rendering, and is based on Deno rather than NodeJS.
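
The Lambda memory advice in the list above comes down to simple arithmetic: Lambda bills compute by allocated memory multiplied by execution time. The following is a rough sketch only. The function name and the default price per GB-second are assumptions for illustration (check current AWS pricing for your region and architecture), and per-request charges are ignored.

```python
def lambda_compute_cost(memory_mb, avg_duration_ms, invocations,
                        price_per_gb_second=0.0000166667):
    """Estimate the compute portion of a Lambda bill.

    price_per_gb_second is an assumed illustrative rate, not an
    authoritative figure; per-request fees are not included.
    """
    gb = memory_mb / 1024              # allocated memory in GB
    seconds = avg_duration_ms / 1000   # average run time in seconds
    return gb * seconds * invocations * price_per_gb_second


if __name__ == "__main__":
    # Compare the same hypothetical workload at two memory settings.
    for memory_mb in (2048, 512):
        cost = lambda_compute_cost(memory_mb, avg_duration_ms=300,
                                   invocations=10_000_000)
        print(f"{memory_mb} MB: ${cost:.2f}")
```

Lowering memory can also lengthen execution time, because Lambda allocates CPU in proportion to memory, so measure the duration at each setting before settling on one.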

Web

  • Facebook is considering whether to rescind its bans on health misinformation. The pandemic is over, after all. Except that it isn’t. However, being a conduit for health misinformation is clearly profitable.
  • Priority Hints are a way for web developers to tell the browser which parts of the page are most important, so that they can be rendered quickly. They are currently supported by the Chrome and Edge browsers.
  • Hotwire, HTMX, and Unpoly are frameworks for building complex web applications while minimizing the need for complex JavaScript. Are they an alternative to heavyweight JavaScript frameworks like React? Could a return to server-side web applications lead to a resurgence of platforms like Ruby on Rails?
  • Facebook has started encrypting the portions of URLs that are used to track users, preventing the Firefox and Brave browsers from stripping the tracking portion of the URL.
  • A priori censorship?  A popular cloud-based word processor in China has been observed censoring content upon the creation of a link for sharing the content. The document is locked; it cannot be edited or even opened by the author.
  • The Pirate Library Mirror is exactly what it says: a mirror of libraries of pirated books. It is focused on the preservation of human knowledge. There is no search engine, and it is only accessible by using BitTorrent over Tor.

Web3

  • Minecraft has decided that they will not “support or allow” the integration of NFTs into their virtual worlds. They object to “digital ownership based on scarcity and exclusion.”
  • Mixers are cryptocurrency services that randomize the currency you use; rather than pay with your own coin, you deposit money in a mixer and pay with randomly selected coins from other users. It’s similar to a traditional bank in that you never withdraw the same money you deposited.
  • So much for privacy. Coinbase, one of the largest cryptocurrency exchanges, sells geolocation data to ICE (the US Immigration and Customs Enforcement agency).
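
The mixer idea described in the list above can be illustrated with a toy model. This is a sketch under heavy assumptions, not how any real mixing service is implemented: the `ToyMixer` class and its methods are hypothetical, every coin is treated as having equal value, and fees, delays, and on-chain details are left out.

```python
import random


class ToyMixer:
    """Toy model of a mixer: a shared pool of equal-value coins."""

    def __init__(self, seed=None):
        self._pool = []                   # (coin_id, depositor) pairs
        self._rng = random.Random(seed)   # seedable for repeatable demos

    def deposit(self, depositor, coin_ids):
        """Add a user's coins to the shared pool."""
        for coin_id in coin_ids:
            self._pool.append((coin_id, depositor))

    def withdraw(self, count):
        """Remove and return `count` coins chosen at random from the pool."""
        if count > len(self._pool):
            raise ValueError("not enough coins in the pool")
        picked = self._rng.sample(self._pool, count)
        for coin in picked:
            self._pool.remove(coin)
        return picked

    def pool_size(self):
        """Number of coins currently held by the mixer."""
        return len(self._pool)


if __name__ == "__main__":
    mixer = ToyMixer()
    mixer.deposit("alice", ["a1", "a2", "a3"])
    mixer.deposit("bob", ["b1", "b2", "b3"])
    # Alice pays with three coins drawn at random from the whole pool,
    # so the coins she spends need not be the ones she deposited.
    print(mixer.withdraw(3))
```

The larger and more varied the pool, the harder it is to link a withdrawal back to a particular deposit, which is the property the bank analogy points at.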

Quantum Computing

  • Quantum computers aren’t limited to binary: That limit is imposed by analogy to traditional computers, but some quantum computers have access to more state, and taking advantage of those states may make applications like simulating physical or biological systems easier.
  • Is quantum-aided computing for some industrial applications just around the corner? IonQ and GE have announced results from a hybrid system for risk management. The quantum computer does random sampling from probability distributions, which are computationally expensive for classical computers; the rest of the computation is classical.
  • Quantum networking is becoming real: researchers have created entangled qubits via a 33-mile fiber optic connection. In addition to their importance for secure communications, quantum networks may be a crucial step in building quantum computers at scale.
  • NIST has announced four candidate algorithms for post-quantum cryptography. While it may be years before quantum computing can break current algorithms, many organizations are anxious to start the transition from current algorithms.

Biology

  • Not long ago (2020), DeepMind released AlphaFold, which used AI to solve protein folding problems. In 2021, they announced a public database containing the structure of a million proteins. With their latest additions, that database now contains the structure of over 200 million proteins, almost every protein known to science.
  • A motor made of DNA!  This nanoscale motor uses ideas from origami to fold DNA in a way that causes it to rotate when an electrical field is applied.
  • An electrode has been implanted into the brain of an ALS patient that will allow them to communicate thoughts via computer. The patient has otherwise lost the ability to move or speak.
  • Genetic editing with CRISPR was tested in a human to permanently lower LDL (“bad cholesterol”) levels. If this works, it could make heart attacks much rarer, and could be the first widespread use of CRISPR in humans.

Energy

Work

  • Some cities (largely in the US South and Midwest) are giving cash bonuses to tech workers who are willing to move there and work remotely.
  • The FBI is warning employers that they are seeing an increasing number of fraudulent applications for remote work in which the application uses stolen personal information and deepfake imagery.
