
Archive for the ‘Alphazero’ Category

Toronto scientists help create AI-powered bot that can play chess like a human – blogTO

Posted: January 31, 2021 at 8:53 am


without comments

If you know anything about the intersection of chess and technology, you're likely familiar with IBM's famous "Deep Blue," the first computer ever to beat a reigning (human) world champion at his own game back in 1997.

A lot has happened in the world of artificial intelligence since that time, to the point where humans can no longer even compete with chess engines. They're simply too powerful.

In fact, according to the authors of an exciting new paper on the subject, no human has been able to beat a computer in a chess tournament for more than 15 years.

Given that human beings don't generally like to lose, playing chess with a purpose-built bot is no longer enjoyable. But what if an AI could be trained to play not like a robot, but like another person?

Meet Maia, a new "human-like chess engine" that has been trained not to beat people, but to emulate them, developed by researchers at the University of Toronto, Cornell University and Microsoft.

Using the open-source, neural network-based chess engine Leela, which is based on DeepMind's AlphaZero, the scientists trained Maia using millions of actual online human games "with the goal of playing the most human-like moves, instead of being trained on self-play games with the goal of playing the optimal moves."

Nine different Maias were actually developed to account for varying skill levels, all of them producing different (but incredibly positive) results in terms of how well they could predict the exact moves of human players in actual games.

According to the paper, which is co-authored by U of T's Ashton Anderson, Maia now "predicts human moves at a much higher accuracy than existing engines, and can achieve maximum accuracy when predicting decisions made by players at a specific skill level in a tuneable way."
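To make that training objective concrete, here is a minimal, hypothetical sketch of the supervised setup the coverage describes: rather than learning from self-play toward optimal moves, the network is trained to predict the move a human actually played in a given position. The PolicyNet architecture, the 1858-slot move encoding and the random stand-in data are illustrative assumptions, not the Maia team's actual code.

```python
# Hypothetical sketch of Maia-style training: the label is the HUMAN move,
# not an engine's best move. All names and shapes here are illustrative.
import torch
import torch.nn as nn

class PolicyNet(nn.Module):
    def __init__(self, n_planes=12, n_moves=1858):  # 1858: a move-encoding size used by some chess nets
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(n_planes, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * 8 * 8, n_moves),
        )

    def forward(self, board_planes):
        return self.body(board_planes)  # logits over move slots

net = PolicyNet()
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# One gradient step on a fake batch of 32 encoded positions.
boards = torch.randn(32, 12, 8, 8)           # stand-in for real positions
human_moves = torch.randint(0, 1858, (32,))  # stand-in for the moves humans played

loss = loss_fn(net(boards), human_moves)  # "predict the human move"
opt.zero_grad(); loss.backward(); opt.step()
```

Training one such network per rating band, as the article describes for the nine Maias, would then simply mean filtering the game data by player rating before this step.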

Maia takes skill level into account when predicting which moves its human opponents will take. Here, the chess engine predicts that people will stop playing a specific wrong move when they're rated around 1500. Image via the Maia team.

Cool? Certainly, and the project is already helping online chess buffs play more enjoyable matches. But the implications of the research actually go far beyond online games.

The root goal in developing Maia, according to the study's authors, was to learn more about how to improve human-AI interaction.

"As artificial intelligence becomes increasingly intelligent, in some cases achieving superhuman performance, there is growing potential for humans to learn from and collaborate with algorithms," reads a description of the project from U of T's Computational Social Science Lab.

"However, the ways in which AI systems approach problems are often different from the ways people do, and thus may be uninterpretable and hard to learn from."

The researchers say that a crucial step in bridging the gap between human and machine learning styles is to model "the granular actions that constitute human behavior," as opposed to simply matching the aggregate performance of humans.

In other words, they need to teach the robots not only what we know, but how to think like us.

"Chess been described as the 'fruit fly' of AI research," said Cornell professor and study co-author Jon Kleinberg in a news update published by the American univeristy this week.

"Just as geneticists often care less about the fruit fly itself than its role as a model organism, AI researchers love chess, because it's one of their model organisms. Its a self-contained world you can explore, and it illustrates many of the phenomena that we see in AI more broadly."

Excerpt from:

Toronto scientists help create AI-powered bot that can play chess like a human - blogTO

Written by admin

January 31st, 2021 at 8:53 am

Posted in Alphazero

Scientists say dropping acid can help with social anxiety and alcoholism – The Next Web

Posted: at 8:53 am


without comments

What happens when the pandemic finally ends and hundreds of millions of people who've spent an inordinate amount of time secluded are suddenly launched back into the rat race?

Things will likely never go back to normal, but eventually we'll find a way to occupy space together again, and that could be difficult for people who've developed social anxiety or had setbacks in their treatment due to the unique nature of pandemic isolation.

We couldn't find any actual rats to ask how they're coping with the race, but a team of laboratory mice might just have the answer: it's dropping a bunch of acid and letting nature do its thing.

According to a team of researchers from McGill University, LSD (colloquially known as acid) makes people more social and capable of greater human empathy.

The team figured this out by giving lab mice LSD and then measuring their brain activity. The mice became more social while under the influence. And the positive effects of LSD were immediately nullified when the scientists used bursts of light to interrupt the chemical processes, thus rendering the mice immediately sober.

The researchers' work led to novel insight into how LSD causes a cascade effect of receptor and synapse activity that ultimately seems to kick-start neurotypical feelings of empathy and social inclination.

Due to the nature of the specific chemical reactions occurring in the brain upon the consumption of LSD, it would appear as though it's a strong candidate for the potential treatment of myriad mental illnesses and for those with autism spectrum disorder.

Per the team's research paper:

These results indicate that LSD selectively enhances SB by potentiating mPFC excitatory transmission through 5-HT2A/AMPA receptors and mTOR signaling. The activation of 5-HT2A/AMPA/mTORC1 in the mPFC by psychedelic drugs should be explored for the treatment of mental diseases with SB impairments such as autism spectrum disorder and social anxiety disorder.

Quick take: Scientists have understood the effect LSD has on mood receptors in the brain for decades. What's new here is that we now know how those interactions cause other interactions that create what's essentially a system for increasing empathy or decreasing social anxiety.

Recent research on LSD, cannabis, and psilocybin (shrooms) indicates each has myriad uses for combating and treating mental illness and other disorders related to neurotypical receptor and synapse regulation.

The McGill team's research on LSD, for example, indicates it could prove useful to fight the harmful effects of alcoholism, where people are at increased risk of developing social anxiety due to addiction, thus further isolating themselves from others.

This latest study is important in that it drives home what decades of research and millennia of anecdotal evidence already tells us: Some drugs have the potential to do good.

And if we could study them like rational humans, instead of allowing politicians to make it almost impossible for researchers to conduct controlled, long-term studies on so-called banned substances, the world would be a better place.

If you think this is interesting, check out this piece on Neural from earlier today. Where the study in the article you've just read says LSD can amplify empathy and reduce social anxiety, this one shows how empathy happens in a theory of the mind that can be identified down to the single-neuron level.


Read more here:

Scientists say dropping acid can help with social anxiety and alcoholism - The Next Web

Written by admin

January 31st, 2021 at 8:53 am

Posted in Alphazero

AI has almost solved one of biology's greatest challenges – how protein unfolds – ThePrint

Posted: December 14, 2020 at 1:55 am


without comments


Solving what biologists call the protein-folding problem is a big deal. Proteins are the workhorses of cells and are present in all living organisms. They are made up of long chains of amino acids and are vital for the structure of cells and communication between them as well as regulating all of the chemistry in the body.

This week, the Google-owned artificial intelligence company DeepMind demonstrated a deep-learning program called AlphaFold2, which experts are calling a breakthrough toward solving the grand challenge of protein folding.

Proteins are long chains of amino acids linked together like beads on a string. But for a protein to do its job in the cell, it must fold, a process of twisting and bending that transforms the molecule into a complex three-dimensional structure that can interact with its target in the cell. If the folding is disrupted, then the protein won't form the correct shape and it won't be able to perform its job inside the body. This can lead to disease, as is the case in a common disease like Alzheimer's, and rare ones like cystic fibrosis.

Deep learning is a computational technique that uses the often hidden information contained in vast datasets to solve questions of interest. It's been used widely in fields such as games, speech and voice recognition, autonomous cars, science and medicine.

I believe that tools like AlphaFold2 will help scientists to design new types of proteins, ones that may, for example, help break down plastics and fight future viral pandemics and disease.

I am a computational chemist and author of the book The State of Science. My students and I study the structure and properties of fluorescent proteins using protein-folding computer programs based on classical physics.

After decades of study by thousands of research groups, these protein-folding prediction programs are very good at calculating structural changes that occur when we make small alterations to known molecules.

But they haven't adequately managed to predict how proteins fold from scratch. Before deep learning came along, the protein-folding problem seemed impossibly hard, and it seemed poised to frustrate computational chemists for many decades to come.

The sequence of the amino acids, which is encoded in DNA, defines the protein's 3D shape. The shape determines its function. If the structure of the protein changes, it is unable to perform its function. Correctly predicting protein folds based on the amino acid sequence could revolutionize drug design, and explain the causes of new and old diseases.

All proteins with the same sequence of amino acid building blocks fold into the same three-dimensional form, which optimizes the interactions between the amino acids. They do this within milliseconds, although they have an astronomical number of possible configurations available to them, about 10 to the power of 300. This massive number is what makes it hard to predict how a protein folds even when scientists know the full sequence of amino acids that go into making it. Previously, predicting the structure of a protein from its amino acid sequence was impossible. Protein structures were experimentally determined, a time-consuming and expensive endeavor.
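To see where a number of that magnitude comes from, here is a Levinthal-style back-of-the-envelope calculation; the three-conformations-per-residue figure and the residue count are illustrative assumptions, not numbers from the article.

```python
# Back-of-the-envelope estimate of protein conformation counts
# (Levinthal-style; the constants below are assumptions for illustration).
import math

states_per_residue = 3   # assumed coarse backbone conformations per residue
residues = 630           # a large protein; 3**630 is on the order of 10^300

log10_configs = residues * math.log10(states_per_residue)
print(f"~10^{log10_configs:.0f} possible configurations")  # ~10^301

# Even testing a billion billion (1e18) conformations per second,
# exhaustive search would still take an absurdly long time...
log10_seconds = log10_configs - 18
print(f"exhaustive search: ~10^{log10_seconds:.0f} seconds")
# ...yet real proteins fold in milliseconds, which is why prediction
# cannot work by enumeration.
```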

Once researchers can better predict how proteins fold, they'll be able to better understand how cells function and how misfolded proteins cause disease. Better protein prediction tools will also help us design drugs that can target a particular topological region of a protein where chemical reactions take place.


The success of DeepMind's protein-folding prediction program, called AlphaFold, is not unexpected. Other deep-learning programs written by DeepMind have demolished the world's best chess, Go and poker players.

In 2016 Stockfish-8, an open-source chess engine, was the world's computer chess champion. It evaluated 70 million chess positions per second and had centuries of accumulated human chess strategies and decades of computer experience to draw upon. It played efficiently and brutally, mercilessly beating all its human challengers without an ounce of finesse. Enter deep learning.

On Dec. 7, 2017, Google's deep-learning chess program AlphaZero thrashed Stockfish-8. The chess engines played 100 games, with AlphaZero winning 28 and tying 72. It didn't lose a single game. AlphaZero did only 80,000 calculations per second, as opposed to Stockfish-8's 70 million calculations, and it took just four hours to learn chess from scratch by playing against itself a few million times and optimizing its neural networks as it learned from its experience.

AlphaZero didn't learn anything from humans or chess games played by humans. It taught itself and, in the process, derived strategies never seen before. In a commentary in Science magazine, former world chess champion Garry Kasparov wrote that by learning from playing itself, AlphaZero developed strategies that reflect the truth of chess rather than reflecting the priorities and prejudices of the programmers. It's the embodiment of the cliché "work smarter, not harder."

Every two years, the world's top computational chemists test the abilities of their programs to predict the folding of proteins and compete in the Critical Assessment of Structure Prediction (CASP) competition.

In the competition, teams are given the linear sequence of amino acids for about 100 proteins for which the 3D shape is known but hasn't yet been published; they then have to compute how these sequences would fold. In 2018 AlphaFold, the deep-learning rookie at the competition, beat all the traditional programs, but barely.

Two years later, on Monday, it was announced that AlphaFold2 had won the 2020 competition by a healthy margin. It whipped its competitors, and its predictions were comparable to the existing experimental results determined through gold standard techniques like X-ray diffraction crystallography and cryo-electron microscopy. Soon I expect AlphaFold2 and its progeny will be the methods of choice to determine protein structures before resorting to experimental techniques that require painstaking, laborious work on expensive instrumentation.

One of the reasons for AlphaFold2's success is that it could use the Protein Data Bank, which has over 170,000 experimentally determined 3D structures, to train itself to calculate the correctly folded structures of proteins.

The potential impact of AlphaFold can be appreciated if one compares the number of all published protein structures (approximately 170,000) with the 180 million DNA and protein sequences deposited in the Universal Protein Database. AlphaFold will help us sort through treasure troves of DNA sequences, hunting for new proteins with unique structures and functions.

As with the chess and Go programs AlphaZero and AlphaGo, we don't exactly know what the AlphaFold2 algorithm is doing or why it uses certain correlations, but we do know that it works.

Besides helping us predict the structures of important proteins, understanding AlphaFold's thinking will also help us gain new insights into the mechanism of protein folding.


One of the most common fears expressed about AI is that it will lead to large-scale unemployment. AlphaFold still has a significant way to go before it can consistently and successfully predict protein folding.

However, once it has matured and the program can simulate protein folding, computational chemists will be integrally involved in improving the programs, trying to understand the underlying correlations used, and applying the program to solve important problems such as the protein misfolding associated with many diseases such as Alzheimer's, Parkinson's, cystic fibrosis and Huntington's disease.

AlphaFold and its offspring will certainly change the way computational chemists work, but it won't make them redundant. Other areas won't be as fortunate. In the past robots were able to replace humans doing manual labor; with AI, our cognitive skills are also being challenged.

Marc Zimmer, Professor of Chemistry, Connecticut College

This article is republished from The Conversation under a Creative Commons license. Read the original article.



Read the original here:

AI has almost solved one of biology's greatest challenges – how protein unfolds - ThePrint

Written by admin

December 14th, 2020 at 1:55 am

Posted in Alphazero

Facebook AI Introduces ‘ReBeL’: An Algorithm That Generalizes The Paradigm Of Self-Play Reinforcement Learning And Search To Imperfect-Information…

Posted: at 1:55 am


without comments

Most AI systems excel in generating specific responses to a particular problem. Today, AI can outperform humans in various fields. For AI to do any task it is presented with, it needs to generalize, learn, and understand new situations as they occur without supplementary guidance. However, while humans can recognize chess and poker both as games in the broadest sense, teaching a single AI to play both is challenging.

Perfect-Information games versus Imperfect-Information games

AI systems are relatively successful at mastering perfect-information games like chess, where nothing is hidden from either player. Each player can see the entire board and all possible moves in all instances. With bots like AlphaZero, AI systems can even combine reinforcement learning with search (RL+Search) to teach themselves to master these games from scratch.

Unlike perfect-information games and single-agent settings, imperfect-information games pose a critical challenge: an action's value may depend on the probability with which it is chosen. Therefore, the team states that it is also crucial to include the probability that different sequences of actions occurred, and not just the sequences of actions alone.

ReBeL

Facebook has recently introduced Recursive Belief-based Learning (ReBeL). It is a general RL+Search algorithm that works in all two-player zero-sum games, including imperfect-information games. ReBeL builds on the RL+Search algorithms that have proved successful in perfect-information games. However, unlike past AIs, ReBeL makes decisions by factoring in the probability distribution of the different views each player might have about the game's current state, which is called a public belief state (PBS). For example, ReBeL can assess the hand its poker opponent thinks it has.

Earlier RL+Search algorithms break down in imperfect-information games like poker, where complete information is not available (for example, players keep their cards secret). These algorithms give a fixed value to each action regardless of whether the action is chosen. For instance, in chess, a good move is good irrespective of whether it is chosen frequently or rarely. But in games like poker, the more a player bluffs, the less those bluffs are worth, as opponents can alter their strategy to call more of them. Thus the Pluribus poker bot was trained on an approach that uses search during actual gameplay rather than beforehand.
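A toy example helps show why beliefs must be tracked explicitly. The sketch below applies Bayes' rule to a made-up three-hand situation; the hands, the prior and the opponent's betting strategy are all invented for illustration, and ReBeL's actual public belief states are vastly larger than this.

```python
# Toy belief update of the kind a public belief state (PBS) rests on.
# All numbers are invented for illustration.

prior = {"strong": 0.2, "medium": 0.5, "weak": 0.3}   # P(hand)
p_bet = {"strong": 0.9, "medium": 0.3, "weak": 0.4}   # P(bet | hand); 0.4 on weak = bluffing

# Bayes' rule after observing a bet: re-weight each hand by how likely
# it was to produce the observed action, then normalize.
unnorm = {h: prior[h] * p_bet[h] for h in prior}
total = sum(unnorm.values())
posterior = {h: round(p / total, 3) for h, p in unnorm.items()}

print(posterior)  # {'strong': 0.4, 'medium': 0.333, 'weak': 0.267}
# The bet makes "strong" likelier, but bluffing keeps "weak" alive.
# If the opponent bluffed more, the update would change -- which is exactly
# why an action's value depends on the probability with which it is chosen.
```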

ReBeL can treat imperfect-information games similar to perfect-information games by accounting for the views of each player. Facebook has developed a modified RL+Search algorithm that ReBeL can leverage to work with the higher-dimensional state and action range of imperfect-information games.

Experiments show that ReBeL is efficient in large-scale two-player zero-sum imperfect-information games such as Liar's Dice and poker. ReBeL achieves superhuman performance by even defeating a top human professional in the benchmark game of heads-up no-limit Texas Hold'em.

Several earlier systems have aimed at the same goal. However, ReBeL achieves it using considerably less expert domain knowledge than any previous poker AI. This is a crucial step toward building a generalized AI that can solve complex real-world problems involving hidden information, like negotiations, fraud detection, cybersecurity, etc.

Limitations:

ReBeL is the first AI to enable RL+Search in imperfect-information games. However, there are some limitations to its current implementation.

Nevertheless, ReBeL achieves low exploitability in benchmark games and is a significant start toward creating more general AI algorithms. To promote further research, Facebook has open-sourced the implementation of ReBeL for Liar's Dice.

GitHub (ReBeL for Liar's Dice): https://github.com/facebookresearch/rebel

Source: https://ai.facebook.com/blog/rebel-a-general-game-playing-ai-bot-that-excels-at-poker-and-more

Related Paper: https://arxiv.org/pdf/2007.13544.pdf

Read the original here:

Facebook AI Introduces 'ReBeL': An Algorithm That Generalizes The Paradigm Of Self-Play Reinforcement Learning And Search To Imperfect-Information...

Written by admin

December 14th, 2020 at 1:55 am

Posted in Alphazero

When 3 is greater than 5 – Chessbase News

Posted: October 22, 2020 at 9:59 pm


without comments

10/18/2020 Star columnist Jon Speelman explores the exchange sacrifice. Speelman shares five illustrative examples to explain in which conditions giving up a rook for a minor piece is a good trade. As a general rule, and in fact (almost all?) of the time, you need other pieces on the board for an exchange sacrifice to work. | Pictured: Mikhail Tal and Tigran Petrosian following a post-mortem analysis at the 1961 European Team Championship in Oberhausen | Photo: Gerhard Hund


[Note that Jon Speelman also looks at the content of the article in video format, here embedded at the end of the article.]

During the Norway tournament, I streamed commentary a couple of times myself at twitch.tv/jonspeelman, but mainly listened to the official commentary by Vladimir Kramnik and Judit Polgar.

Both were very interesting, and Kramnik in particular has a chess aesthetic which I very much like. In his prime a powerhouse positional player with superb endgame technique, he started life much more tactically, and his instinct is to sacrifice for the initiative whenever possible, especially the exchange: an approach which, after defence seemed to triumph under traditional chess engines, has been given a new lease of life by AlphaZero.

So I thought today that I'd look at some nice exchange sacrifices, but first a moment from Norway where I was actually a tad disappointed by a winning sacrifice.

At the end of a beautiful positional game, which has been annotated here in Game of the Week, Carlsen finished off with the powerful

42.Re8!

and after

42...Qxe8 43.Qh6+ Kg8 44.Qxg6+ Kh8 45.Nf6

Tari resigned

Of course, I would have played Re8 myself in a game if I'd seen it, but I was hoping from an aesthetic perspective that Carlsen would complete this real masterclass and masterpiece with a nice zugzwang.

You start with 42.c4 to prevent the reply ...c4 to 42.f3, creating some very slight confusion, and then it goes:

42.c4 Kg8 43.f3

And for example: 43...Qd7 44.Qh6 Qe6 45.Kg3 fxe4 (45...Rg7 46.Nf6+ Kf7 47.Qh8 Qe7 48.Kg2) 46.dxe4 Rf4 47.Nxf4 exf4+ 48.Kxf4 Qf7+ 49.Kg3 Qg7 50.Qxg7+ Kxg7 51.Rxf8

Black can also try 43...Rh7

and here after 44.Rxf8+ Kg7

as the engine pointed out to me, it's best to use the Re8 trick:

45.Qxh7+! (45.Rf6 is much messier) 45...Kxh7 46.Re8!


The black queen is trapped.

For today's examples I used my memory and the ChessBase search mask when I couldn't track down a game exactly. For instance, for the first one by Botvinnik [pictured], I set him as Black with 0-1, disabled ignoring colours, and put Rd4 e5 c5 on the board, which turned out to identify the single game I wanted: a hole in one! I also asked my stream on Thursday for any examples, and one of my stalwarts, a Scottish Frenchman, found me Reshevsky v Petrosian (I couldn't remember offhand who Petrosian's opponent was) and drew my attention to the beautiful double exchange sacrifice by Erwin L'Ami from Wijk aan Zee B.

Before the games themselves, which are in chronological order, it might be worthwhile to consider what makes an exchange sacrifice successful. Whole books have been written on this and I'm certainly not going to be able to go into serious detail. But a couple of points:

The need for extra pieces applies particularly to endgames. For instance, this diagram should definitely be lost for Black:

It's far from trivial, but as a general schema the white king should be able to advance right into Black's guts and then White can do things with his pawns. Something like get Ke7 and Rf6, then g4 exchanging pawns if Black has played ...h5. Play f5, move the rook, play f6+, and arrange to play Rxf7.

But if you add a pair of rooks then it becomes enormously difficult. And indeed I really dont know whether God would beat God.


Read more:

When 3 is greater than 5 - Chessbase News

Written by admin

October 22nd, 2020 at 9:59 pm

Posted in Alphazero

AlphaZero – Wikipedia

Posted: October 17, 2020 at 10:54 am


without comments

Game-playing artificial intelligence

AlphaZero is a computer program developed by artificial intelligence research company DeepMind to master the games of chess, shogi and go. This algorithm uses an approach similar to AlphaGo Zero.

On December 5, 2017, the DeepMind team released a preprint introducing AlphaZero, which within 24 hours of training achieved a superhuman level of play in these three games by defeating world-champion programs Stockfish, elmo, and the 3-day version of AlphaGo Zero. In each case it made use of custom tensor processing units (TPUs) that the Google programs were optimized to use.[1] AlphaZero was trained solely via "self-play" using 5,000 first-generation TPUs to generate the games and 64 second-generation TPUs to train the neural networks, all in parallel, with no access to opening books or endgame tables. After four hours of training, DeepMind estimated AlphaZero was playing at a higher Elo rating than Stockfish 8; after 9 hours of training, the algorithm defeated Stockfish 8 in a time-controlled 100-game tournament (28 wins, 0 losses, and 72 draws).[1][2][3] The trained algorithm played on a single machine with four TPUs.

DeepMind's paper on AlphaZero was published in the journal Science on 7 December 2018.[4] In 2019 DeepMind published a new paper detailing MuZero, a new algorithm able to generalise on AlphaZero work playing both Atari and board games without knowledge of the rules or representations of the game.[5]

AlphaZero (AZ) is a more generalized variant of the AlphaGo Zero (AGZ) algorithm, and is able to play shogi and chess as well as Go. Differences between AZ and AGZ include:[1]

Comparing Monte Carlo tree search speeds, AlphaZero searches just 80,000 positions per second in chess and 40,000 in shogi, compared to 70 million for Stockfish and 35 million for elmo. AlphaZero compensates for the lower number of evaluations by using its deep neural network to focus much more selectively on the most promising variation.[1]
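That selective focus comes from a PUCT-style selection rule described in the AlphaGo Zero and AlphaZero papers, in which the network's prior probability for a move weights the exploration bonus. The Python rendering below is illustrative; the constant c_puct and the node representation are assumptions, not DeepMind's code.

```python
# Illustrative PUCT move selection as used in AlphaZero-style MCTS.
import math

def puct_score(q, prior, n_parent, n_child, c_puct=1.5):
    """q: mean value of the child; prior: network probability for the move.
    Moves the network likes get a larger exploration bonus."""
    return q + c_puct * prior * math.sqrt(n_parent) / (1 + n_child)

def select_move(children):
    n_parent = sum(c["visits"] for c in children)
    return max(children, key=lambda c: puct_score(
        c["q"], c["prior"], n_parent, c["visits"]))

children = [
    {"move": "e4", "q": 0.10, "prior": 0.50, "visits": 10},
    {"move": "d4", "q": 0.30, "prior": 0.20, "visits": 3},
    {"move": "h4", "q": 0.00, "prior": 0.01, "visits": 0},
]
print(select_move(children)["move"])  # "d4": decent value, still under-explored
# Moves with tiny priors (h4) are rarely visited at all, which is how a
# budget of ~80,000 evaluations/second can compete with ~70 million.
```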

AlphaZero was trained solely via self-play, using 5,000 first-generation TPUs to generate the games and 64 second-generation TPUs to train the neural networks. In parallel, the in-training AlphaZero was periodically matched against its benchmark (Stockfish, elmo, or AlphaGo Zero) in brief one-second-per-move games to determine how well the training was progressing. DeepMind judged that AlphaZero's performance exceeded the benchmark after around four hours of training for Stockfish, two hours for elmo, and eight hours for AlphaGo Zero.[1]

In AlphaZero's chess match against Stockfish 8 (2016 TCEC world champion), each program was given one minute per move. Stockfish was allocated 64 threads and a hash size of 1 GB,[1] a setting that Stockfish's Tord Romstad later criticized as suboptimal.[6][note 1] AlphaZero was trained on chess for a total of nine hours before the match. During the match, AlphaZero ran on a single machine with four application-specific TPUs. In 100 games from the normal starting position, AlphaZero won 25 games as White, won 3 as Black, and drew the remaining 72.[8] In a series of twelve 100-game matches (of unspecified time or resource constraints) against Stockfish starting from the 12 most popular human openings, AlphaZero won 290, drew 886 and lost 24.[1]

AlphaZero was trained on shogi for a total of two hours before the tournament. In 100 shogi games against elmo (World Computer Shogi Championship 27 summer 2017 tournament version with YaneuraOu 4.73 search), AlphaZero won 90 times, lost 8 times and drew twice.[8] As in the chess games, each program got one minute per move, and elmo was given 64 threads and a hash size of 1GB.[1]

After 34 hours of self-learning of Go and against AlphaGo Zero, AlphaZero won 60 games and lost 40.[1][8]

DeepMind stated in its preprint, "The game of chess represented the pinnacle of AI research over several decades. State-of-the-art programs are based on powerful engines that search many millions of positions, leveraging handcrafted domain expertise and sophisticated domain adaptations. AlphaZero is a generic reinforcement learning algorithm originally devised for the game of go that achieved superior results within a few hours, searching a thousand times fewer positions, given no domain knowledge except the rules."[1] DeepMind's Demis Hassabis, a chess player himself, called AlphaZero's play style "alien": It sometimes wins by offering counterintuitive sacrifices, like offering up a queen and bishop to exploit a positional advantage. "It's like chess from another dimension."[9]

Given the difficulty in chess of forcing a win against a strong opponent, the +28 −0 =72 result is a significant margin of victory. However, some grandmasters, such as Hikaru Nakamura and Komodo developer Larry Kaufman, downplayed AlphaZero's victory, arguing that the match would have been closer if the programs had access to an opening database (since Stockfish was optimized for that scenario).[10] Romstad additionally pointed out that Stockfish is not optimized for rigidly fixed-time moves and that the version used was a year old.[6][11]

Similarly, some shogi observers argued that the elmo hash size was too low, that the resignation settings and the "EnteringKingRule" settings (cf. shogi Entering King) may have been inappropriate, and that elmo is already obsolete compared with newer programs.[12][13]

Papers headlined that the chess training took only four hours: "It was managed in little more than the time between breakfast and lunch."[2][14] Wired hyped AlphaZero as "the first multi-skilled AI board-game champ".[15] AI expert Joanna Bryson noted that Google's "knack for good publicity" was putting it in a strong position against challengers. "It's not only about hiring the best programmers. It's also very political, as it helps make Google as strong as possible when negotiating with governments and regulators looking at the AI sector."[8]

Human chess grandmasters generally expressed excitement about AlphaZero. Danish grandmaster Peter Heine Nielsen likened AlphaZero's play to that of a superior alien species.[8] Norwegian grandmaster Jon Ludvig Hammer characterized AlphaZero's play as "insane attacking chess" with profound positional understanding.[2] Former champion Garry Kasparov said "It's a remarkable achievement, even if we should have expected it after AlphaGo."[10][16]

Grandmaster Hikaru Nakamura was less impressed, and stated "I don't necessarily put a lot of credibility in the results simply because my understanding is that AlphaZero is basically using the Google supercomputer and Stockfish doesn't run on that hardware; Stockfish was basically running on what would be my laptop. If you wanna have a match that's comparable you have to have Stockfish running on a supercomputer as well."[7]

Top US correspondence chess player Wolff Morrow was also unimpressed, claiming that AlphaZero would probably not make the semifinals of a fair competition such as TCEC where all engines play on equal hardware. Morrow further stated that although he might not be able to beat AlphaZero if AlphaZero played drawish openings such as the Petroff Defence, AlphaZero would not be able to beat him in a correspondence chess game either.[17]

Motohiro Isozaki, the author of YaneuraOu, noted that although AlphaZero did comprehensively beat elmo, the rating of AlphaZero in shogi stopped growing at a point which is at most 100~200 higher than elmo. This gap is not that high, and elmo and other shogi software should be able to catch up in 1–2 years.[18]

DeepMind addressed many of the criticisms in their final version of the paper, published in December 2018 in Science.[4] They further clarified that AlphaZero was not running on a supercomputer; it was trained using 5,000 tensor processing units (TPUs), but only ran on four TPUs and a 44-core CPU in its matches.[19]

In the final results, Stockfish version 8 ran under the same conditions as in the TCEC superfinal: 44 CPU cores, Syzygy endgame tablebases, and a 32GB hash size. Instead of a fixed time control of one move per minute, both engines were given 3 hours plus 15 seconds per move to finish the game. In a 1000-game match, AlphaZero won with a score of 155 wins, 6 losses, and 839 draws. DeepMind also played a series of games using the TCEC opening positions; AlphaZero also won convincingly.
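For a sense of scale, a score of 155 wins, 6 losses and 839 draws is 57.45 percent, which under the standard logistic Elo model corresponds to a gap of roughly 50 rating points; this conversion is a common rule of thumb, not a figure from DeepMind's paper.

```python
# Rough Elo gap implied by the 1,000-game result (standard logistic model;
# this back-of-the-envelope conversion is mine, not DeepMind's).
import math

wins, losses, draws = 155, 6, 839
score = (wins + 0.5 * draws) / (wins + losses + draws)      # 0.5745
elo_gap = -400 * math.log10(1 / score - 1)
print(f"score {score:.2%} -> ~{elo_gap:.0f} Elo stronger")  # ~52 Elo
```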

Similar to Stockfish, Elmo ran under the same conditions as in the 2017 CSA championship. The version of Elmo used was WCSC27 in combination with YaneuraOu 2017 Early KPPT 4.79 64AVX2 TOURNAMENT. Elmo operated on the same hardware as Stockfish: 44 CPU cores and a 32GB hash size. AlphaZero won 98.2% of games when playing black (which plays first in shogi) and 91.2% overall.

Human grandmasters were generally impressed with AlphaZero's games against Stockfish.[20] Former world champion Garry Kasparov said it was a pleasure to watch AlphaZero play, especially since its style was open and dynamic like his own.[21][22]

In the chess community, Komodo developer Mark Lefler called it a "pretty amazing achievement", but also pointed out that the data was old, since Stockfish had gained a lot of strength since January 2018 (when Stockfish 8 was released). Fellow developer Larry Kaufman said AlphaZero would probably lose a match against the latest version of Stockfish, Stockfish 10, under Top Chess Engine Championship (TCEC) conditions. Kaufman argued that the only advantage of neural network-based engines was that they used a GPU, so if there was no regard for power consumption (e.g. in an equal-hardware contest where both engines had access to the same CPU and GPU) then anything the GPU achieved was "free". Based on this, he stated that the strongest engine was likely to be a hybrid with neural networks and standard alpha-beta search.[23]

AlphaZero inspired the computer chess community to develop Leela Chess Zero, using the same techniques as AlphaZero. Leela contested several championships against Stockfish, where it showed similar strength.[24]

In 2019 DeepMind published MuZero, a unified system that played excellent chess, shogi, and go, as well as games in the Atari Learning Environment, without being pre-programmed with their rules.[25][26]

The match results by themselves are not particularly meaningful because of the rather strange choice of time controls and Stockfish parameter settings: The games were played at a fixed time of 1 minute/move, which means that Stockfish has no use of its time management heuristics (lot of effort has been put into making Stockfish identify critical points in the game and decide when to spend some extra time on a move; at a fixed time per move, the strength will suffer significantly). The version of Stockfish used is one year old, was playing with far more search threads than has ever received any significant amount of testing, and had way too small hash tables for the number of threads. I believe the percentage of draws would have been much higher in a match with more normal conditions.[7]

Link:

AlphaZero - Wikipedia

Written by admin

October 17th, 2020 at 10:54 am

Posted in Alphazero

AlphaZero: Shedding new light on chess, shogi, and Go …

Posted: at 10:54 am


without comments

As with Go, we are excited about AlphaZero's creative response to chess, which has been a grand challenge for artificial intelligence since the dawn of the computing age, with early pioneers including Babbage, Turing, Shannon, and von Neumann all trying their hand at designing chess programs. But AlphaZero is about more than chess, shogi or Go. To create intelligent systems capable of solving a wide range of real-world problems, we need them to be flexible and generalise to new situations. While there has been some progress towards this goal, it remains a major challenge in AI research, with systems capable of mastering specific skills to a very high standard but often failing when presented with even slightly modified tasks.

AlphaZero's ability to master three different complex games, and potentially any perfect information game, is an important step towards overcoming this problem. It demonstrates that a single algorithm can learn how to discover new knowledge in a range of settings. And, while it is still early days, AlphaZero's creative insights, coupled with the encouraging results we see in other projects such as AlphaFold, give us confidence in our mission to create general purpose learning systems that will one day help us find novel solutions to some of the most important and complex scientific problems.

This work was done by David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy Lillicrap, Karen Simonyan, and Demis Hassabis.

Follow this link:

AlphaZero: Shedding new light on chess, shogi, and Go ...

Written by admin

October 17th, 2020 at 10:54 am

Posted in Alphazero

AlphaGo Zero – Wikipedia

Posted: at 10:54 am


without comments

Artificial intelligence that plays Go

AlphaGo Zero is a version of DeepMind's Go software AlphaGo. AlphaGo's team published an article in the journal Nature on 19 October 2017, introducing AlphaGo Zero, a version created without using data from human games, and stronger than any previous version.[1] By playing games against itself, AlphaGo Zero surpassed the strength of AlphaGo Lee in three days by winning 100 games to 0, reached the level of AlphaGo Master in 21 days, and exceeded all the old versions in 40 days.[2]

Training artificial intelligence (AI) without datasets derived from human experts has significant implications for the development of AI with superhuman skills, because expert data is "often expensive, unreliable or simply unavailable."[3] Demis Hassabis, the co-founder and CEO of DeepMind, said that AlphaGo Zero was so powerful because it was "no longer constrained by the limits of human knowledge".[4] David Silver, one of the first authors of DeepMind's papers published in Nature on AlphaGo, said that it is possible to have generalised AI algorithms by removing the need to learn from humans.[5]

Google later developed AlphaZero, a generalized version of AlphaGo Zero that could play chess and shogi in addition to Go. In December 2017, AlphaZero beat the 3-day version of AlphaGo Zero by winning 60 games to 40, and with 8 hours of training it outperformed AlphaGo Lee on an Elo scale. AlphaZero also defeated a top chess program (Stockfish) and a top shogi program (Elmo).[6][7]

AlphaGo Zero's neural network was trained using TensorFlow, with 64 GPU workers and 19 CPU parameter servers. Only four TPUs were used for inference. The neural network initially knew nothing about Go beyond the rules. Unlike earlier versions of AlphaGo, Zero only perceived the board's stones, rather than having some rare human-programmed edge cases to help recognize unusual Go board positions. The AI engaged in reinforcement learning, playing against itself until it could anticipate its own moves and how those moves would affect the game's outcome.[8] In the first three days AlphaGo Zero played 4.9 million games against itself in quick succession.[9] It appeared to develop the skills required to beat top humans within just a few days, whereas the earlier AlphaGo took months of training to achieve the same level.[10]
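The shape of that self-play loop is easier to see in miniature. The sketch below plays a trivial invented game against itself and nudges a tabular policy toward moves that occurred in winning games; every detail (the toy game, the table standing in for a neural network, the crude update) is a placeholder for the network-plus-search machinery AlphaGo Zero actually uses.

```python
# Toy self-play reinforcement loop: play, score, reinforce, repeat.
# Everything here is a deliberately crude stand-in for illustration.
import random

policy_table = {}  # (state, move) -> learned score; stands in for a network

def policy(state, moves=(1, 2, 3), epsilon=0.1):
    if random.random() < epsilon:            # occasional exploration
        return random.choice(moves)
    return max(moves, key=lambda m: policy_table.get((state, m), 0.0))

def play_one_game():
    """Self-play one toy game of 10 plies; 'win' if the final sum is even."""
    history, state = [], 0
    for _ in range(10):
        move = policy(state)
        history.append((state, move))
        state += move
    return history, (1 if state % 2 == 0 else -1)

def reinforce(history, outcome):
    """Nudge the policy toward moves that appeared in winning games."""
    for state, move in history:
        policy_table[(state, move)] = policy_table.get((state, move), 0.0) + outcome

for _ in range(1000):                        # generations of self-play
    history, outcome = play_one_game()
    reinforce(history, outcome)
```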

For comparison, the researchers also trained a version of AlphaGo Zero using human games, AlphaGo Master, and found that it learned more quickly, but actually performed more poorly in the long run.[11] DeepMind submitted its initial findings in a paper to Nature in April 2017, which was then published in October 2017.[1]

The hardware cost for a single AlphaGo Zero system in 2017, including the four TPUs, has been quoted as around $25 million.[12]

According to Hassabis, AlphaGo's algorithms are likely to be of the most benefit to domains that require an intelligent search through an enormous space of possibilities, such as protein folding or accurately simulating chemical reactions.[13] AlphaGo's techniques are probably less useful in domains that are difficult to simulate, such as learning how to drive a car.[14] DeepMind stated in October 2017 that it had already started active work on attempting to use AlphaGo Zero technology for protein folding, and stated it would soon publish new findings.[15][16]

AlphaGo Zero was widely regarded as a significant advance, even when compared with its groundbreaking predecessor, AlphaGo. Oren Etzioni of the Allen Institute for Artificial Intelligence called AlphaGo Zero "a very impressive technical result" in "both their ability to do it and their ability to train the system in 40 days, on four TPUs".[8] The Guardian called it a "major breakthrough for artificial intelligence", citing Eleni Vasilaki of Sheffield University and Tom Mitchell of Carnegie Mellon University, who called it "an impressive feat" and "an outstanding engineering accomplishment" respectively.[14] Mark Pesce of the University of Sydney called AlphaGo Zero "a big technological advance" taking us into "undiscovered territory".[17]

Gary Marcus, a psychologist at New York University, has cautioned that for all we know, AlphaGo may contain "implicit knowledge that the programmers have about how to construct machines to play problems like Go" and will need to be tested in other domains before being sure that its base architecture is effective at much more than playing Go. In contrast, DeepMind is "confident that this approach is generalisable to a large number of domains".[9]

In response to the reports, South Korean Go professional Lee Sedol said, "The previous version of AlphaGo wasn't perfect, and I believe that's why AlphaGo Zero was made." On the potential for AlphaGo's development, Lee said he will have to wait and see but also said it will affect young Go players. Mok Jin-seok, who directs the South Korean national Go team, said the Go world has already been imitating the playing styles of previous versions of AlphaGo and creating new ideas from them, and he is hopeful that new ideas will come out from AlphaGo Zero. Mok also added that general trends in the Go world are now being influenced by AlphaGo's playing style. "At first, it was hard to understand and I almost felt like I was playing against an alien. However, having had a great amount of experience, I've become used to it," Mok said. "We are now past the point where we debate the gap between the capability of AlphaGo and humans. It's now between computers." Mok has reportedly already begun analyzing the playing style of AlphaGo Zero along with players from the national team. "Though having watched only a few matches, we received the impression that AlphaGo Zero plays more like a human than its predecessors," Mok said.[18] Chinese Go professional Ke Jie commented on the remarkable accomplishments of the new program: "A pure self-learning AlphaGo is the strongest. Humans seem redundant in front of its self-improvement."[19]


On 5 December 2017, the DeepMind team released a preprint on arXiv introducing AlphaZero, a program using a generalized version of AlphaGo Zero's approach, which within 24 hours achieved a superhuman level of play in chess, shogi, and Go, defeating the world-champion programs Stockfish, Elmo, and the 3-day version of AlphaGo Zero in each case.[6]

AlphaZero (AZ) is a more generalized variant of the AlphaGo Zero (AGZ) algorithm, and is able to play shogi and chess as well as Go. Differences between AZ and AGZ include:[6]

An open-source program, Leela Zero, based on the ideas from the AlphaGo papers, is available. It uses a GPU instead of the TPUs that recent versions of AlphaGo rely on.

Link:

AlphaGo Zero - Wikipedia

Written by admin

October 17th, 2020 at 10:54 am

Posted in Alphazero

AlphaZero Crushes Stockfish In New 1,000-Game Match …

Posted: at 10:54 am


without comments

In news reminiscent of the initial AlphaZero shockwave last December, the artificial intelligence company DeepMind released astounding results from an updated version of the machine-learning chess project today.

The results leave no question, once again, that AlphaZero plays some of the strongest chess in the world.

The updated AlphaZero crushed Stockfish 8 in a new 1,000-game match, scoring +155 -6 =839. (See below for three sample games from this match with analysis by Stockfish 10 and video analysis by GM Robert Hess.)

AlphaZero also bested Stockfish in a series of time-odds matches, soundly beating the traditional engine even at time odds of 10 to one.

In additional matches, the new AlphaZero beat the "latest development version" of Stockfish, with virtually identical results as in the match vs Stockfish 8, according to DeepMind. The pre-release copy of the journal article, which is dated Dec. 7, 2018, does not specify the exact development version used.

[Update: Today's release of the full journal article specifies that the match was against the latest development version of Stockfish as of Jan. 13, 2018, which was Stockfish 9.]

The machine-learning engine also won all matches against "a variant of Stockfish that uses a strong opening book," according to DeepMind. Adding the opening book did seem to help Stockfish, which finally won a substantial number of games when AlphaZero was Black, but not enough to win the match.

AlphaZero's results (wins green, losses red) vs the latest Stockfish and vs Stockfish with a strong opening book. Image by DeepMind via Science.

The results will be published in an upcoming article by DeepMind researchers in the journal Science and were provided to selected chess media by DeepMind, which is based in London and owned by Alphabet, the parent company of Google.

The 1,000-game match was played in early 2018. In the match, both AlphaZero and Stockfish were given three hours each game plus a 15-second increment per move. This time control would seem to make obsolete one of the biggest arguments against the impact of last year's match, namely that the 2017 time control of one minute per move played to Stockfish's disadvantage.

With three hours plus the 15-second increment, no such argument can be made, as that is an enormous amount of playing time for any computer engine. In the time odds games, AlphaZero was dominant up to 10-to-1 odds. Stockfish only began to outscore AlphaZero when the odds reached 30-to-1.

AlphaZero's results (wins green, losses red) vs Stockfish 8 in time odds matches. Image by DeepMind via Science.

AlphaZero's results in the time odds matches suggest it is not only much stronger than any traditional chess engine, but that it also uses a much more efficient search for moves. According to DeepMind, AlphaZero uses a Monte Carlo tree search, and examines about 60,000 positions per second, compared to 60 million for Stockfish.

An illustration of how AlphaZero searches for chess moves. Image by DeepMind via Science.

What can computer chess fans conclude after reading these results? AlphaZero has solidified its status as one of the elite chess players in the world. But the results are even more intriguing if you're following the ability of artificial intelligence to master general gameplay.

According to the journal article, the updated AlphaZero algorithm is identical in three challenging games: chess, shogi, and go. This version of AlphaZero was able to beat the top computer players of all three games after just a few hours of self-training, starting from just the basic rules of the games.

The updated AlphaZero results come exactly one year to the day since DeepMind unveiled the first, historic AlphaZero results in a surprise match vs Stockfish that changed chess forever.

Since then, an open-source project called Lc0 has attempted to replicate the success of AlphaZero, and the project has fascinated chess fans. Lc0 now competes along with the champion Stockfish and the rest of the world's top engines in the ongoing Chess.com Computer Chess Championship.

CCC fans will be pleased to see that some of the new AlphaZero games include "fawn pawns," the CCC-chat nickname for lone advanced pawns that cramp an opponent's position. Perhaps the establishment of these pawns is a critical winning strategy, as it seems AlphaZero and Lc0 have independently learned it.

DeepMind released 20 sample games chosen by GM Matthew Sadler from the 1,000-game match. Chess.com has selected three of these games with deep analysis by Stockfish 10 and video analysis by GM Robert Hess. You can download the 20 sample games at the bottom of this article, analyzed by Stockfish 10, and four sample games analyzed by Lc0.

Update: After this article was published, DeepMind released 210 sample games that you can download here.

Selected game 1 with analysis by Stockfish 10:

Game 1 video analysis by GM Robert Hess:

Selected game 2 with analysis by Stockfish 10:

Game 2 video analysis by GM Robert Hess:

Selected game 3 with analysis by Stockfish 10:

Game 3 video analysis by GM Robert Hess:

IM Anna Rudolf also made a video analysis of one of the sample games, calling it "AlphaZero's brilliancy."

The new version of AlphaZero trained itself to play chess starting just from the rules of the game, using machine-learning techniques to continually update its neural networks. According to DeepMind, 5,000 TPUs (Google's tensor processing unit, an application-specific integrated circuit for artificial intelligence) were used to generate the first set of self-play games, and then 16 TPUs were used to train the neural networks.

The total training time in chess was nine hours from scratch. According to DeepMind, it took the new AlphaZero just four hours of training to surpass Stockfish; by nine hours it was far ahead of the world-champion engine.

For the games themselves, Stockfish used 44 CPU (central processing unit) cores and AlphaZero used a single machine with four TPUs and 44 CPU cores. Stockfish had a hash size of 32GB and used syzygy endgame tablebases.

AlphaZero's results vs. Stockfish in the most popular human openings. In the left bar, AlphaZero plays White; in the right bar, AlphaZero is Black. Image by DeepMind via Science. Click on the image for a larger version.

The sample games released were deemed impressive by chess professionals who were given preview access to them. GM Robert Hess categorized the games as "immensely complicated."

DeepMind itself noted the unique style of its creation in the journal article:

"In several games, AlphaZero sacrificed pieces for long-term strategic advantage, suggesting that it has a more fluid, context-dependent positional evaluation than the rule-based evaluations used by previous chess programs," the DeepMind researchers said.

The AI company also emphasized the importance of using the same AlphaZero version in three different games, touting it as a breakthrough in overall game-playing intelligence:

"These results bring us a step closer to fulfilling a longstanding ambition of artificial intelligence: a general game-playing system that can learn to master any game," the DeepMind researchers said.

You can download the 20 sample games provided by DeepMind and analyzed by Chess.com using Stockfish 10 on a powerful computer. The first set of games contains 10 games with no opening book, and the second set contains games with openings from the 2016 TCEC (Top Chess Engine Championship).

PGN downloads:

20 games with analysis by Stockfish 10:

4 selected games with analysis by Lc0:

Love AlphaZero? You can watch the machine-learning chess project it inspired, Lc0, in the ongoing Computer Chess Championship now.

Read the rest here:

AlphaZero Crushes Stockfish In New 1,000-Game Match ...

Written by admin

October 17th, 2020 at 10:54 am

Posted in Alphazero

ACM Prize in Computing Awarded to AlphaGo Developer – HPCwire

Posted: April 6, 2020 at 5:57 pm


without comments

NEW YORK, April 1, 2020 ACM, the Association for Computing Machinery, announced that David Silver is the recipient of the 2019 ACM Prize in Computing for breakthrough advances in computer game-playing. Silver is a Professor at University College London and a Principal Research Scientist at DeepMind, a Google-owned artificial intelligence company based in the United Kingdom. Silver is recognized as a central figure in the growing and impactful area of deep reinforcement learning.

Silver's most highly publicized achievement was leading the team that developed AlphaGo, a computer program that defeated the world champion of the game Go, a popular abstract board game. Silver developed the AlphaGo algorithm by deftly combining ideas from deep learning, reinforcement learning, traditional tree search and large-scale computing. AlphaGo is recognized as a milestone in artificial intelligence (AI) research and was ranked by New Scientist magazine as one of the top 10 discoveries of the last decade.

AlphaGo was initialized by training on expert human games followed by reinforcement learning to improve its performance. Subsequently, Silver sought even more principled methods for achieving greater performance and generality. He developed the AlphaZero algorithm that learned entirely by playing games against itself, starting without any human data or prior knowledge except the game rules. AlphaZero achieved superhuman performance in the games of chess, Shogi, and Go, demonstrating unprecedented generality of the game-playing methods.

The ACM Prize in Computing recognizes early-to-mid-career computer scientists whose research contributions have fundamental impact and broad implications. The award carries a prize of $250,000, from an endowment provided by Infosys Ltd. Silver will formally receive the ACM Prize at ACM's annual awards banquet on June 20, 2020 in San Francisco.

Computer Game-Playing and AI

Teaching computer programs to play games, against humans or other computers, has been a central practice in AI research since the 1950s. Game playing, which requires an agent to make a series of decisions toward an objective (winning), is seen as a useful facsimile of human thought processes. Game-playing also affords researchers results that are easily quantifiable: that is, did the computer follow the rules, score points, and/or win the game?

At the dawn of the field, researchers developed programs to compete with humans at checkers, and over the decades, increasingly sophisticated chess programs were introduced. A watershed moment occurred in 1997, when ACM sponsored a tournament in which IBM's Deep Blue became the first computer to defeat a world chess champion, Garry Kasparov. At the same time, the objective of the researchers was not simply to develop programs to win games, but to use game-playing as a touchstone to develop machines with capacities that simulated human intelligence.

"Few other researchers have generated as much excitement in the AI field as David Silver," said ACM President Cherri M. Pancake. "Human vs. machine contests have long been a yardstick for AI. Millions of people around the world watched as AlphaGo defeated the Go world champion, Lee Sedol, on television in March 2016. But that was just the beginning of Silver's impact. His insights into deep reinforcement learning are already being applied in areas such as improving the efficiency of the UK's power grid, reducing power consumption at Google's data centers, and planning the trajectories of space probes for the European Space Agency."

"Infosys congratulates David Silver for his accomplishments in making foundational contributions to deep reinforcement learning and thus rapidly accelerating the state of the art in artificial intelligence," said Pravin Rao, COO of Infosys. "When computers can defeat world champions at complex board games, it captures the public imagination and attracts young researchers to areas like machine learning. Importantly, the frameworks that Silver and his colleagues have developed will inform all areas of AI, as well as practical applications in business and industry for many years to come. Infosys is proud to provide financial support for the ACM Prize in Computing and to join with ACM in recognizing outstanding young computing professionals."

Silver is credited with being one of the foremost proponents of a new machine learning tool called deep reinforcement learning, in which the algorithm learns by trial-and-error in an interactive environment. The algorithm continually adjusts its actions based on the information it accumulates while it is running. In deep reinforcement learning, artificial neural networks (computation models that use different layers of mathematical processing) are effectively combined with reinforcement learning strategies to evaluate the trial-and-error results. Instead of having to calculate every possible outcome, the algorithm makes predictions, leading to more efficient execution of a given task.
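To make the idea concrete, here is a heavily simplified sketch (our illustration, not DeepMind's code) of one deep Q-learning update: a small neural network estimates action values, and each observed transition nudges that estimate toward the reward actually received. The dimensions and hyperparameters are arbitrary assumptions.

    # Sketch of a single deep Q-learning update step in PyTorch.
    # A neural network predicts action values; trial-and-error
    # experience pulls those predictions toward observed rewards.
    import torch
    import torch.nn as nn

    STATE_DIM, N_ACTIONS, GAMMA = 4, 2, 0.99   # assumed toy sizes

    q_net = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                          nn.Linear(64, N_ACTIONS))
    optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)

    def td_update(state, action, reward, next_state, done):
        """One gradient step toward the temporal-difference target."""
        q_pred = q_net(state)[action]                 # current estimate Q(s, a)
        with torch.no_grad():                         # target is held fixed
            bootstrap = 0.0 if done else GAMMA * q_net(next_state).max()
        target = reward + bootstrap
        loss = (q_pred - target) ** 2                 # squared TD error
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    # One dummy transition standing in for real interactive experience.
    td_update(torch.randn(STATE_DIM), action=1, reward=1.0,
              next_state=torch.randn(STATE_DIM), done=False)

The essential point is what is absent: there is no exhaustive enumeration of outcomes. The network generalizes from the transitions it has actually experienced.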

Learning Atari from Scratch

At the Neural Information Processing Systems Conference (NeurIPS) in 2013, Silver and his colleagues at DeepMind presented a program that could play 50 Atari games to human-level ability. The program learned to play the games based solely on observing the pixels and scores while playing. Earlier reinforcement learning approaches had not achieved anything close to this level of ability.

Silver and his colleagues published their method of combining reinforcement learning with artificial neural networks in a seminal 2015 paper, "Human-Level Control Through Deep Reinforcement Learning," which was published in Nature. The paper has been cited nearly 10,000 times and has had an immense impact on the field. Subsequently, Silver and his colleagues continued to refine these deep reinforcement learning algorithms with novel techniques, and these algorithms remain among the most widely used tools in machine learning.

AlphaGo

The game of Go was invented in China 2,500 years ago and has remained popular, especially in Asia. Go is regarded as far more complex than chess, as there are vastly more potential moves a player can make, as well as many more ways a game can play out. Silver first began exploring the possibility of developing a computer program that could master Go when he was a PhD student at the University of Alberta, and it remained a continuing research interest.

Silver's key insight in developing AlphaGo was to combine deep neural networks with an algorithm used in computer game-playing called Monte Carlo Tree Search. One strength of Monte Carlo Tree Search is that, while pursuing the perceived best strategy in a game, the algorithm is also continually investigating other alternatives. AlphaGo's defeat of world Go champion Lee Sedol in March 2016 was hailed as a milestone moment in AI. Silver and his colleagues published the foundational technology underpinning AlphaGo in the paper "Mastering the Game of Go with Deep Neural Networks and Tree Search," which was published in Nature in 2016.
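That balance between pursuing the best-known move and investigating alternatives is visible in the selection rule at the heart of such a search. Below is a minimal sketch of a PUCT-style selection step in the spirit of AlphaGo's search; the constant and the toy statistics are illustrative assumptions, not values from the paper.

    # Minimal sketch of the child-selection rule in Monte Carlo Tree
    # Search as combined with a neural network (a PUCT variant).
    import math

    C_PUCT = 1.5  # exploration constant (assumed value)

    def select_child(children):
        """Pick the child maximizing value estimate + exploration bonus.

        Each child is a dict with:
          'value'  - mean value from simulations (exploitation term)
          'prior'  - neural-network policy probability for the move
          'visits' - number of times this child has been explored
        """
        total_visits = sum(c['visits'] for c in children)
        def puct(c):
            exploit = c['value']
            explore = C_PUCT * c['prior'] * math.sqrt(total_visits) / (1 + c['visits'])
            return exploit + explore
        return max(children, key=puct)

    # Toy example: the second move has a lower value estimate but has
    # barely been visited, so its exploration bonus wins out and the
    # search keeps investigating the alternative.
    children = [
        {'value': 0.60, 'prior': 0.50, 'visits': 40},
        {'value': 0.55, 'prior': 0.30, 'visits': 5},
    ]
    print(select_child(children))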

AlphaGo Zero, AlphaZero and AlphaStar

Silver and his team at DeepMind have continued to develop new algorithms that have significantly advanced the state of the art in computer game-playing and achieved results many in the field thought were not yet possible for AI systems. In developing the AlphaGo Zero algorithm, Silver and his collaborators demonstrated that it is possible for a program to master Go without any access to human expert games. The algorithm learns entirely by playing against itself, without any human data or prior knowledge except the rules of the game, and, in a further iteration, without even knowing the rules.
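The self-play recipe is simple to state: play a game against yourself, then label every position visited with the game's final outcome and train on those labels. The sketch below illustrates only that bookkeeping, using a trivial stand-in "game" so it runs; in AlphaGo Zero the moves come from network-guided tree search, not a coin flip.

    # Conceptual sketch of self-play data collection: positions are
    # recorded during the game and labeled afterwards with the result.
    import random

    def play_selfplay_game(choose_move, max_moves=10):
        """Record (state, move) pairs, then label them with the outcome."""
        state, history = 0, []
        for _ in range(max_moves):
            move = choose_move(state)       # stand-in for search + network
            history.append((state, move))
            state += move                   # toy transition rule
        outcome = 1 if state > 0 else -1    # toy win condition
        # Every position inherits the game's final result, which is the
        # training signal for the value prediction.
        return [(s, m, outcome) for (s, m) in history]

    training_data = play_selfplay_game(lambda s: random.choice([-1, 1]))
    print(training_data[:3])                # (state, move, outcome) examples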

Later, the DeepMind team's AlphaZero also achieved superhuman performance in chess, Shogi, and Go. In chess, AlphaZero easily defeated world computer chess champion Stockfish, a high-performance program designed by grandmasters and chess programming experts. Just last year, the DeepMind team, led by Silver, developed AlphaStar, which mastered the multiplayer video game StarCraft II, a game that had been regarded as a stunningly hard challenge for AI learning systems.

The DeepMind team continues to advance these technologies and find applications for them. Among other initiatives, Google is exploring how to use deep reinforcement learning approaches to manage robotic machinery at factories.

Biographical Background

David Silver is Lead of the Reinforcement Learning Research Group at DeepMind, and a Professor of Computer Science at University College London. DeepMind, a subsidiary of Google, seeks to combine the best techniques from machine learning and systems neuroscience to build powerful general-purpose learning algorithms.

Silver earned Bachelor's and Master's degrees from Cambridge University in 1997 and 2000, respectively. In 1998 he co-founded the video games company Elixir Studios, where he served as Chief Technology Officer and Lead Programmer. Silver returned to academia and earned a PhD in Computer Science from the University of Alberta in 2009. Silver's numerous honors include the Marvin Minsky Medal (2018) for outstanding achievements in artificial intelligence, the Royal Academy of Engineering Silver Medal (2017) for outstanding contribution to UK engineering, and the Mensa Foundation Prize (2017) for best scientific discovery in the field of artificial intelligence.

About the ACM Prize in Computing

The ACM Prize in Computing recognizes an early- to mid-career fundamental innovative contribution in computing that, through its depth, impact and broad implications, exemplifies the greatest achievements in the discipline. The award carries a prize of $250,000. Financial support is provided by an endowment from Infosys Ltd. The ACM Prize in Computing was previously known as the ACM-Infosys Foundation Award in the Computing Sciences from 2007 through 2015. ACM Prize recipients are invited to participate in the Heidelberg Laureate Forum, an annual networking event that brings together young researchers from around the world with recipients of the ACM A.M. Turing Award (computer science), the Abel Prize (mathematics), the Fields Medal (mathematics), and the Nevanlinna Prize (mathematics).

About ACM

ACM, the Association for Computing Machinery, is the world's largest educational and scientific computing society, uniting educators, researchers and professionals to inspire dialogue, share resources and address the field's challenges. ACM strengthens the computing profession's collective voice through strong leadership, promotion of the highest standards, and recognition of technical excellence. ACM supports the professional growth of its members by providing opportunities for lifelong learning, career development, and professional networking.

About Infosys

Infosys is a global leader in next-generation digital services and consulting. We enable clients in 46 countries to navigate their digital transformation. With over three decades of experience in managing the systems and workings of global enterprises, we expertly steer our clients through their digital journey. We do it by enabling the enterprise with an AI-powered core that helps prioritize the execution of change. We also empower the business with agile digital at scale to deliver unprecedented levels of performance and customer delight. Our always-on learning agenda drives clients' continuous improvement through building and transferring digital skills, expertise, and ideas from our innovation ecosystem.

Source: ACM

Read the rest here:

ACM Prize in Computing Awarded to AlphaGo Developer - HPCwire

Written by admin

April 6th, 2020 at 5:57 pm

Posted in Alphazero

