Meta released a new large language model called LLaMA

Also: The Washington Post profiles Riley Goodeside and prompt engineering

Feb 27, 2023

Illustration of a speech bubble, modern style, flat, vector, clean lines, white background, blue colors UHD / Midjourney

Good morning! ☕

Transistori is a newsletter published on weekday mornings that covers technology news and internet culture. It is sponsored by Three Point Consulting, a provider of software and data consulting services. Subscribe to Transistori and keep up with the latest developments in the tech world!

News 🗞️

Meta released a new large language model called LLaMA

Last Friday, the AI research group at Meta, the parent company of Facebook, WhatsApp, and Instagram, released a new large language model called LLaMA (Large Language Model Meta AI). What makes the model interesting is that it is significantly smaller than OpenAI's GPT-3 (at most 65 billion parameters vs. 175 billion), yet according to the research group's own benchmarks, even its 13-billion-parameter version outperforms GPT-3 on most of them:

As part of Meta’s commitment to open science, today we are publicly releasing LLaMA (Large Language Model Meta AI), a state-of-the-art foundational large language model designed to help researchers advance their work in this subfield of AI. Smaller, more performant models such as LLaMA enable others in the research community who don’t have access to large amounts of infrastructure to study these models, further democratizing access in this important, fast-changing field.
Training smaller foundation models like LLaMA is desirable in the large language model space because it requires far less computing power and resources to test new approaches, validate others’ work, and explore new use cases. Foundation models train on a large set of unlabeled data, which makes them ideal for fine-tuning for a variety of tasks. We are making LLaMA available at several sizes (7B, 13B, 33B, and 65B parameters) and also sharing a LLaMA model card that details how we built the model in keeping with our approach to Responsible AI practices.

From the abstract of Meta's paper:

In particular, LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMA-65B is competitive with the best models, Chinchilla70B and PaLM-540B. We release all our models to the research community.

Unfortunately, Meta did not make the model generally available. Instead, researchers can apply for access. The decision drew some mockery on Twitter:

Jupyter Meowbooks @untitled01ipynb

I remember a time when demanding an email form for access to anything was a big "f u" in the academic sense.

HMD unveiled a new, easily repairable Nokia smartphone

HMD, which licenses the Nokia mobile phone brand, unveiled the new Nokia G22 over the weekend and says the phone is very easy to repair yourself. It costs €179, and owners can replace parts such as the battery, the screen, and the charging port on their own. The phone goes on sale on March 8. From The Verge:

HMD has worked to make what it says are the most common smartphone repairs — replacing a broken screen, charging port, or flat battery — a simpler process on its new Nokia G22, and it’s partnering with repair specialists iFixit to provide customers with the necessary replacement parts, tools, and guides. The Nokia G22 will be available on March 8th in the UK for £149.99 (€179 / around $180) and will be sold in select global markets like Europe but not the US.
…
That’s why it’s significant that HMD is boasting how quickly you can replace the G22’s battery or screen. To emphasize the ease of repair, Adam Ferguson, HMD’s head of product, successfully replaced the battery in the Nokia G22 during a press briefing about the phone. This wasn’t as easy as swapping out a removable battery — Ferguson had to open the phone with a guitar pick-style piece of plastic and detach a delicate-looking cable to remove the battery — but the whole process took around five minutes. A similar battery swap on a previous-generation HMD phone or many competing handsets would take closer to 90 minutes, he claims.
For a screen repair “you’re probably looking at 20 minutes” for the Nokia G22, he says. Prices for the Nokia G22’s replacement parts from iFixit range from £18.99 (around $23) for a new charging port to £44.99 (around $54) for a replacement display.

Midjourney blocked certain words from prompts to limit graphic content

One more item worth mentioning: Midjourney[1], the company behind the popular text-to-image model, has temporarily restricted the use of certain English words in prompts given to the model. By banning the words, the company wants to reduce the generation of graphic images on its service. The bans are apparently not permanent; the company is trying to fine-tune its AI models so that generating graphic images becomes harder even when the words are used. From MIT Technology Review:

The popular AI image generator Midjourney bans a wide range of words about the human reproductive system from being used as prompts, MIT Technology Review has discovered.
If someone types “placenta,” “fallopian tubes,” “mammary glands,” “sperm,” “uterine,” “urethra,” “cervix,” “hymen,” or “vulva” into Midjourney, the system flags the word as a banned prompt and doesn’t let it be used. Sometimes, users who tried one of these prompts are blocked for a limited time for trying to generate banned content. Other words relating to human biology, such as “liver” and “kidney,” are allowed.
Midjourney’s founder, David Holz, says it’s banning these words as a stopgap measure to prevent people from generating shocking or gory content while the company “improves things on the AI side.” Holz says moderators watch how words are being used and what kinds of images are being generated, and adjust the bans periodically. The firm has a community guidelines page that lists the type of content it blocks in this way, including sexual imagery, gore, and even the 🍑 emoji, which is often used as a symbol for the buttocks.

Recommendations 🕵️

The Washington Post profiles Riley Goodeside and prompt engineering

The Washington Post published a good article yesterday about prompt engineering and the fledgling field's best-known practitioner, Riley Goodeside. Goodeside is a computer scientist who has built recommendation algorithms at companies including the dating app makers OkCupid and Grindr.

Goodeside left his job during 2022, however, and spent months focusing on prompt engineering for GPT-3. He has since been hired by the AI startup Scale AI to work on prompt engineering:

He left his job and started experimenting heavily with GPT-3, constantly prodding and challenging the tool to try to learn how to focus its attention and map out where its boundaries were. In December, after some of his prompts gained attention online, Scale AI hired him to help communicate with the AI models that the company’s chief executive, Alexandr Wang, described as “a new kind of computer.”

The article also covers PromptBase, a marketplace where prompt engineers sell their prompts to others.

Because language models are developing so quickly, it is very hard to judge how long prompt engineering will last. If the models start teaching their users to write more effective prompts, it may turn out to be a short-lived phenomenon. Right now, though, prompt engineers create value for a range of businesses and act as a kind of psychologist for language models, as Andrej Karpathy, an AI researcher working at OpenAI, quipped on Twitter:

Andrej Karpathy @karpathy

8/ These examples illustrate how prompts 1: matter and 2: are not trivial, and why today it makes sense to be a "prompt engineer" (e.g. @goodside ). I also like to think of this role as a kind of LLM psychologist.

Quick hits 🚀

(€) Apple has hired an advertising executive for its Apple TV+ service.
(€) According to the US Department of Energy, a laboratory leak of the coronavirus now appears to be the most likely cause of the pandemic.
Despite the popularity of the series The Last of Us and the game Hogwarts Legacy, Warner Bros. Discovery is still losing money.
Ericsson is laying off 8,500 employees, or about 8% of its workforce.
More layoffs took place at Twitter over the weekend. Those laid off apparently included Esther Crawford, who had recently been given more responsibility for Twitter's product.
Mathematica and the Wolfram Language are now free on Raspberry Pi computers.
(€) Censorship is constraining Chinese chatbots built on large language models.
Legendary chip designer Jim Keller has said he is founding a new semiconductor manufacturing company with YouTuber Sam Zeloof.
What if meetings are actually the most important part of knowledge work?
(€) Used office furniture is now being traded in Silicon Valley.

[1] I also use it myself to generate the cover images for this newsletter.
