As compelling as the leading large-scale language models may be, the fact remains that only the largest companies have the resources to actually deploy and train them at meaningful scale.
For enterprises eager to leverage AI for competitive advantage, a cheaper, pared-down alternative may be a better fit, especially if it can be tuned to particular industries or domains.
That’s where an emerging set of AI startups hopes to carve out a niche: by building sparse, tailored models that, while perhaps not as powerful as GPT-3, are good enough for enterprise use cases and run on hardware that ditches expensive high-bandwidth memory (HBM) for commodity DDR.
German AI startup Aleph Alpha is one such example. Founded in 2019, the Heidelberg-based company’s Luminous natural-language model boasts many of the same headline-grabbing features as OpenAI’s GPT-3: copywriting, classification, summarization, and translation, to name a few.
The startup has teamed up with Graphcore to explore and develop sparse language models on the British chipmaker’s hardware.
“Graphcore’s IPUs present an opportunity to evaluate advanced technological approaches such as conditional sparsity,” Aleph Alpha CEO Jonas Andrulis said in a statement. “These architectures will undoubtedly play a role in Aleph Alpha’s future research.”
Conditionally sparse models — sometimes called mixture of experts or routed models — only process data against the applicable parameters, something that can significantly reduce the compute resources needed to run them.
For example, if a language model was trained on all the languages on the internet and is then asked a question in Russian, it wouldn’t make sense to run that query through the entire model, only through the parameters related to Russian, Graphcore CTO Simon Knowles explained in an interview with The Register.
“It’s completely obvious. This is how your brain works, and it’s also how an AI ought to work,” he said. “I’ve said this many times, but if an AI can do many things, it doesn’t need to access all of its knowledge to do one thing.”
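The routing idea behind conditional sparsity is easy to sketch. Below is a minimal mixture-of-experts layer in Python with NumPy; the expert count, top-k routing, and gating scheme are illustrative assumptions for the sake of the example, not a description of Luminous or of anything running on Graphcore silicon.

```python
# Minimal sketch of conditional sparsity (mixture-of-experts routing).
# All sizes and the gating scheme are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
d_model, num_experts, top_k = 64, 8, 2

# Each "expert" is a small weight matrix; a learned gate decides which
# experts see each token, so most parameters are never touched per token.
experts = [rng.standard_normal((d_model, d_model)) * 0.02 for _ in range(num_experts)]
gate = rng.standard_normal((d_model, num_experts)) * 0.02

def moe_layer(token: np.ndarray) -> np.ndarray:
    scores = token @ gate                 # one gating logit per expert
    chosen = np.argsort(scores)[-top_k:]  # route to the top-k experts only
    weights = np.exp(scores[chosen])
    weights /= weights.sum()              # softmax over the chosen experts
    # Only top_k of the num_experts weight matrices are read from memory:
    return sum(w * (token @ experts[i]) for w, i in zip(weights, chosen))

print(moe_layer(rng.standard_normal(d_model)).shape)  # (64,) via 2 of 8 experts
```

The saving Knowles describes comes from that routing step: for any one input, only a small fraction of the model’s parameters are fetched and multiplied.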
Knowles, whose company builds accelerators tailored for these kinds of models, unsurprisingly believes they’re the future of AI. “I’d be surprised if, by next year, anyone is building dense language models,” he added.
Sparse language models aren’t without their challenges. One of the most pressing, according to Knowles, has to do with memory. The HBM used in high-end GPUs to achieve the bandwidth and capacity these models require is expensive, and it comes attached to an even more expensive accelerator.
This isn’t an issue for dense language models, where you might need all of that compute and memory, but it poses a problem for sparse models, which favor memory over compute, he explained.
Interconnect tech, like Nvidia’s NVLink, can be used to pool memory across multiple GPUs, but if the model doesn’t require all that compute, the GPUs could be left sitting idle. “It’s a really expensive way to buy memory,” Knowles said.
Graphcore’s accelerators attempt to sidestep this challenge by borrowing a technique as old as computing itself: caching. Each IPU features a relatively large SRAM cache — 1GB — to satisfy the bandwidth requirements of these models, while raw capacity is achieved with large pools of inexpensive DDR4 memory.
“The more SRAM you've got, the less DRAM bandwidth you need, and this is what allows us to not use HBM,” Knowles said.
Because memory is decoupled from the accelerator, it’s far less expensive — the cost of a few commodity DDR modules — for enterprises to support larger AI models.
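Knowles’ bandwidth point is straightforward arithmetic. The sketch below shows how an on-chip SRAM hit rate shrinks the external DRAM bandwidth a chip needs; the demand figure and hit rates are invented for illustration, not Graphcore specifications.

```python
# Back-of-the-envelope look at the SRAM/DRAM trade-off Knowles describes.
# The demand bandwidth and hit rates are assumed, illustrative numbers.
demand_gbps = 2_000  # bandwidth the compute units ask for, in GB/s (assumed)

for sram_hit_rate in (0.0, 0.50, 0.90, 0.99):
    # Only cache misses go out to external memory, so the DRAM bandwidth
    # requirement shrinks linearly with the on-chip hit rate.
    dram_gbps = demand_gbps * (1 - sram_hit_rate)
    print(f"hit rate {sram_hit_rate:4.0%} -> DRAM bandwidth needed {dram_gbps:7.1f} GB/s")
```

Under these assumed numbers, a 99 percent hit rate leaves a residual demand of 20GB/s, comfortably within reach of commodity DDR4, whereas serving the full 2TB/s from external memory would call for HBM-class bandwidth.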
In addition to supporting cheaper memory, Knowles claims the company’s IPUs also have an architectural advantage over GPUs, at least when it comes to sparse models.
Instead of relying on a small number of large matrix multipliers, as a tensor processing unit does, Graphcore’s chips feature a large number of smaller matrix-math units that can address memory independently.
This provides greater granularity for sparse models, where “you need the freedom to fetch relevant subsets, and the smaller the unit you’re obliged to fetch, the more freedom you have,” he explained.
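A toy calculation makes the granularity argument concrete. Assume a parameter matrix of 4,096 rows of 1KB each, of which a sparse model needs just 64 scattered rows; the memory traffic then depends heavily on the smallest unit the hardware can fetch (all sizes here are invented for illustration).

```python
# Why fetch granularity matters for sparse models: coarse fetch units drag
# in many unneeded parameters. All sizes are illustrative assumptions.
import numpy as np

rows_total, row_bytes = 4096, 1024  # parameter matrix: 4,096 rows of 1KB
needed = np.random.default_rng(1).choice(rows_total, size=64, replace=False)

for block_rows in (1, 32, 256):  # smallest unit the hardware can fetch
    blocks_touched = len({row // block_rows for row in needed})
    fetched_kb = blocks_touched * block_rows * row_bytes // 1024
    print(f"fetch unit of {block_rows:3d} rows -> {fetched_kb:5d} KB moved")
```

With a one-row fetch unit, only the 64KB actually needed moves; forcing 256-row fetches drags several megabytes across the bus for the same 64KB of useful data.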
Put together, Knowles argues, this approach enables Graphcore’s IPUs to train large AI/ML models with hundreds of billions or even trillions of parameters at substantially lower cost than GPUs.
However, the enterprise AI market is still in its infancy, and Graphcore faces stiff competition in this space from larger, more established rivals.
So while development of ultra-sparse, cut-rate language models is unlikely to abate anytime soon, it remains to be seen whether it’ll be Graphcore’s IPUs or someone else’s accelerators that end up powering enterprise AI workloads. ®
AI is killing the planet. Wait, no – it's going to save it. According to Hewlett Packard Enterprise VP of AI and HPC Evan Sparks and professor of machine learning Ameet Talwalkar from Carnegie Mellon University, it's not entirely clear just what AI might do for – or to – our home planet.
Speaking at the Six Five Summit this week, the duo discussed one of the more controversial challenges facing AI/ML: the technology's impact on the climate.
"What we've seen over the last few years is that really computationally demanding machine learning technology has become increasingly prominent in the industry," Sparks said. "This has resulted in increasing concerns about the associated rise in energy usage and correlated – not always cleanly – concerns about carbon emissions and carbon footprint of these workloads."
HPE is lifting the lid on a new AI supercomputer – the second this week – aimed at building and training larger machine learning models to underpin research.
Based at HPE's Center of Excellence in Grenoble, France, the new supercomputer is to be named Champollion after the French scholar who made advances in deciphering Egyptian hieroglyphs in the 19th century. It was built in partnership with Nvidia using AMD-based Apollo computer nodes fitted with Nvidia's A100 GPUs.
Champollion brings together HPC and purpose-built AI technologies to train machine-learning models at scale and unlock results faster, HPE said. HPE already provides HPC and AI resources from its Grenoble facilities for customers and the broader research community to access, and said it plans to open Champollion to scientists and engineers globally to accelerate testing of their AI models and research.
After taking serious CPU market share from Intel over the last few years, AMD has revealed larger ambitions in AI, datacenters and other areas with an expanded roadmap of CPUs, GPUs and other kinds of chips for the near future.
These ambitions were laid out at AMD's Financial Analyst Day 2022 event on Thursday, where it signaled intentions to become a tougher competitor for Intel, Nvidia and other chip companies with a renewed focus on building better and faster chips for servers and other devices, becoming a bigger player in AI, enabling applications with improved software, and making more custom silicon.
"These are where we think we can win in terms of differentiation," AMD CEO Lisa Su said in opening remarks at the event. "It's about compute technology leadership. It's about expanding datacenter leadership. It's about expanding our AI footprint. It's expanding our software capability. And then it's really bringing together a broader custom solutions effort because we think this is a growth area going forward."
The US Copyright Office and its director Shira Perlmutter have been sued for rejecting one man's request to register an AI model as the author of an image generated by the software.
You guessed it: Stephen Thaler is back. He said the digital artwork, depicting railway tracks and a tunnel in a wall surrounded by multi-colored, pixelated foliage, was produced by machine-learning software he developed. Authorship of the image, titled A Recent Entrance to Paradise, should be credited to his system, Creativity Machine, and he should be recognized as the owner of the copyrighted work, he argued.
(Owner and author are two separate things, at least in US law: someone who creates material is the author, and they can let someone else own it.)
Updated Australia's federal police and Monash University are asking netizens to send in snaps of their younger selves to train a machine-learning algorithm to spot child abuse in photographs.
Researchers are looking to collect images of people aged 17 and under in safe scenarios; they don't want any nudity, even in a relatively innocuous setting like a child taking a bath. The crowdsourcing campaign, dubbed My Pictures Matter, is open to those aged 18 and above who can consent to having their photographs used for research purposes.
All the images will be amassed into a dataset managed by Monash academics in an attempt to train an AI model to tell the difference between a minor in a normal environment and one in an exploitative, unsafe situation. The software could, in theory, help law enforcement automatically and rapidly pinpoint child sex abuse material (aka CSAM) among the thousands upon thousands of photographs under investigation, sparing human analysts from inspecting every single snap.
A prankster researcher has trained an AI chatbot on over 134 million posts to notoriously freewheeling internet forum 4chan, then set it live on the site before it was swiftly banned.
Yannic Kilcher, an AI researcher who posts some of his work to YouTube, called his creation "GPT-4chan" and described it as "the worst AI ever". He trained GPT-J 6B, an open source language model, on a dataset containing 3.5 years' worth of posts scraped from 4chan's imageboard. Kilcher then developed a chatbot that processed 4chan posts as inputs and generated text outputs, automatically commenting in numerous threads.
Netizens quickly noticed a 4chan account was posting suspiciously frequently, and began speculating whether it was a bot.
IBM's self-sailing Mayflower Autonomous Ship (MAS) has finally crossed the Atlantic albeit more than a year and a half later than planned. Still, congratulations to the team.
That said, MAS missed its target. Instead of arriving in Massachusetts – the US state home to Plymouth Rock where the 17th-century Mayflower landed – the latest in a long list of technical difficulties forced MAS to limp to Halifax in Nova Scotia, Canada. The 2,700-mile (4,400km) journey from Plymouth, UK, came to an end on Sunday.
The 50ft (15m) trimaran is powered by solar energy, with diesel backup, and is said to be able to reach a speed of 10 knots (18.5km/h or 11.5mph) using electric motors. The computer-controlled ship is steered by software that takes in real-time data from six cameras and 50 sensors. This application was trained using IBM’s PowerAI Vision technology and Power servers, we’re told.
IBM chairman and CEO Arvind Krishna says the company offloaded Watson Health this year because it doesn't have the requisite vertical expertise in the healthcare sector.
Talking at stock market analyst Bernstein's 38th Annual Strategic Decisions Conference, the big boss was asked to outline the context for selling the healthcare data and analytics assets of the business to private equity provider Francisco Partners for $1 billion in January.
"Watson Health's divestment has got nothing to do with our commitment to AI and tor the Watson Brand," he told the audience. The "Watson brand will be our carrier for AI."
In brief AI text-to-image generation models are all the rage right now. You give them a simple description of a scene, such as "a vulture typing on a laptop," and they come up with an illustration that resembles that description.
That's the theory, anyway. But developers who have special access to OpenAI's text-to-image engine DALL·E 2 have found all sorts of weird behaviors – including what may be a hidden, made-up language.
Giannis Daras, a PhD student at the University of Texas at Austin, shared artwork produced by DALL·E 2 from the input “Apoploe vesrreaitais eating Contarra ccetnxniams luryca tanniounons” – a phrase that makes no sense to humans yet consistently prompted the model to generate images of birds eating bugs. Daras claimed this was evidence that the model has developed a hidden vocabulary of its own.
In brief Miscreants can easily steal someone else's identity by tricking live facial recognition software using deepfakes, according to a new report.
Sensity AI, a startup focused on tackling identity fraud, carried out a series of pretend attacks. Engineers scanned the image of someone from an ID card and mapped their likeness onto another person's face. Sensity then tested whether it could breach live facial recognition systems by tricking them into believing the pretend attacker was a real user.
So-called "liveness tests" try to authenticate identities in real time, relying on images or video streams from cameras – the kind of face recognition used to unlock mobile phones, for example. Nine out of ten vendors failed Sensity's live deepfake attacks.