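This issue of the member newsletter covers three topics. First, now that machines have learned to "read", could they write a review for The New York Review of Books? If so, how will the role of the book reviewer change?
Second, Europe's artificial intelligence industry has long been overshadowed by talk of a "US-China rivalry", yet the continent is changing fast and has become a third pole of the AI industry alongside the United States and China.
Third, Gary Marcus, a professor and entrepreneur, observes that academia and industry alike have together woven a utopian story about artificial intelligence.
Beyond these, there is an infographic introducing algorithms, and a look at the other side of machine-driven drug discovery.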
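When machines learn to read, will they write book reviews?
Literary criticism is an important part of European and American culture and covers a wide field: discussion of a work's technique, its ideas and the philosophy behind it all falls within its scope. The circle of literary critics has long been a camp of highly educated cultural elites, whose work either appears as monographs or is published in periodicals such as The Times Literary Supplement, The New Yorker and the London Review of Books. A long essay in Aeon raises an interesting question: as humanity moves into the age of artificial intelligence, what changes will machine intelligence bring to literary criticism?
The essay's subtitle uses the word "revolutionising", and it opens with a case: where do the witches of Western literature come from? Where do they mostly "live"? Put differently, in which regions' literature do witches appear most often?
This is a topic well worth studying. The traditional method is to search a vast body of old literature for relevant descriptions and then use deductive and inductive reasoning to establish how they relate in time and space. Timothy Tangherlini, a folklore researcher at the University of California, Los Angeles, and his colleagues instead used an algorithm to uncover the pattern. The AI-based algorithm searched some 30,000 stories and found that the background to "witches" is a fascinating one: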
For example, they found that evil sorcery often took place close to Catholic monasteries. This made a certain amount of sense, since Catholic sites in Denmark were tarred with diabolical associations after the Protestant Reformation in the 16th century. By plotting the distance and direction of witchcraft relative to the storyteller’s location, WitchHunter also showed that enchantresses tend to be found within the local community, much closer to home than other kinds of threats. ‘Witches and robbers are human threats to the economic stability of the community,’ the researchers write. ‘Yet, while witches threaten from within, robbers are generally situated at a remove from the well-described village, often living in woods, forests, or the heath … it seems that no matter how far one goes, nor where one turns, one is in danger of encountering a witch.’
From this case the author goes on to raise a series of questions:
what can algorithms tell us about the stories we love to read? Any proposed answer seems to point to as many uncertainties as it resolves, especially as AI technologies grow in power. Can literature really be sliced up into computable bits of ‘information’, or is there something about the experience of reading that is irreducible? Could AI enhance literary interpretation, or will it alter the field of literary criticism beyond recognition? And could algorithms ever derive meaning from books in the way humans do, or even produce literature themselves?
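These questions also form the basic structure of the essay. Ever since computers, and artificial intelligence in particular, appeared, text analysis has advanced alongside computing, and literary criticism, which likewise rests on analysing texts, has kept changing as well. From the 1970s onward literary criticism turned toward postmodern contexts, and that shift poses new challenges for computational text analysis. For example, how should a computer understand and analyse the opening sentence of García Márquez's One Hundred Years of Solitude: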
Many years later, as he faced the firing squad, Colonel Aureliano Buendía was to remember that distant afternoon when his father took him to discover ice.
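Analysing the complex temporal relations in that sentence calls for supervised learning, in which a machine is fed large amounts of labelled data and draws conclusions from it. Two major challenges remain:
- the accuracy of the conclusions is questionable;
- the accuracy of the conclusions depends on the accuracy of the labelled data.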
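To make the second point concrete, here is a minimal Python sketch of supervised learning. It is not the method discussed in the Aeon essay: the `train` and `predict` helpers, the "past"/"future" labels and the example sentences are all hypothetical, chosen only to show that the same learner reaches a different conclusion when its labels are wrong.

```python
from collections import Counter, defaultdict


def train(examples):
    """Count how often each word appears under each label (hypothetical toy learner)."""
    counts = defaultdict(Counter)
    for text, label in examples:
        for word in text.lower().split():
            counts[label][word] += 1
    return counts


def predict(counts, text):
    """Return the label whose training words overlap most with the text."""
    words = text.lower().split()
    scores = {label: sum(c[w] for w in words) for label, c in counts.items()}
    return max(scores, key=scores.get)


# Hand-labelled training data (illustrative sentences, assumed labels).
clean = [
    ("he was to remember that afternoon", "past"),
    ("his father took him to discover ice", "past"),
    ("many years later he will face the squad", "future"),
    ("she will leave tomorrow", "future"),
]

# The same sentences with every label swapped, imitating annotation errors.
noisy = [(text, "future" if label == "past" else "past") for text, label in clean]

query = "she will remember"
print("trained on clean labels:", predict(train(clean), query))
print("trained on noisy labels:", predict(train(noisy), query))
```

The learner and the query are identical in both runs; only the labels differ, which is why label quality bounds the quality of the conclusions.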
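Still, rapidly advancing machine learning algorithms are expected to resolve these challenges over time. The author is very optimistic on this point, especially on the "sensitive" question of whether machines will replace human literary critics: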
Computational analysis and ‘traditional’ literary interpretation need not be a winner-takes-all scenario. Digital technology has already started to blur the line between creators and critics. In a similar way, literary critics should start combining their deep expertise with ingenuity in their use of AI tools, as Broadwell and Tangherlini did with WitchHunter. Without algorithmic assistance, researchers would be hard-pressed to make such supernaturally intriguing findings, especially as the quantity and diversity of writing proliferates online.
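A panorama of European artificial intelligence
If "software has already eaten the world", then today's software world is being reshaped by artificial intelligence. Seen globally, the AI industry has been divided among the United States, China and Europe. The industry landscape and companies of the first two have been analysed many times, but Europe has long been overlooked, which makes this analysis of the European AI market by the investment firm Asgard especially valuable.
First, a landscape map of the startups:
The report's key points are as follows:
- Europe's AI industry is growing very quickly and is a market that cannot be ignored;
- by country, the United Kingdom has the most complete AI ecosystem, followed by Germany, France and Spain;
- by city, London is without question home to the most startups, followed by Berlin, Paris, Madrid, Stockholm and Amsterdam.
Among these AI companies, data analytics, sales and marketing automation, and healthcare are the main application areas, a distribution quite different from that of China and the United States:
At the company level, the chart below shows the ten best-funded companies in Europe:
Over the past few years Silicon Valley companies have repeatedly acquired European startups. So far few Chinese professors are involved in European AI startups, and China's tech giants have yet to move into the continent. How this will change is well worth watching.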
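The artificial intelligence utopia
Gary Marcus, a professor of psychology and neural science at New York University, is also an entrepreneur: he founded a company called Geometric Intelligence and later sold it to Uber. Someone who spans both academia and industry in this way understands the current predicament of AI very well, and he recently shared his observations in The New York Times: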
I fear, however, that neither of our two current approaches to funding A.I. research — small research labs in the academy and significantly larger labs in private industry — is poised to succeed. I say this as someone who has experience with both models, having worked on A.I. both as an academic researcher and as the founder of a start-up company, Geometric Intelligence, which was recently acquired by Uber.
In Marcus's view, both academic labs and corporate labs have serious limitations. His ideal lab looks like the way European scientists work:
I look with envy at my peers in high-energy physics, and in particular at CERN, the European Organization for Nuclear Research, a huge, international collaboration, with thousands of scientists and billions of dollars of funding. They pursue ambitious, tightly defined projects (like using the Large Hadron Collider to discover the Higgs boson) and share their results with the world, rather than restricting them to a single country or corporation. Even the largest “open” efforts at A.I., like OpenAI, which has about 50 staff members and is sponsored in part by Elon Musk, is tiny by comparison.
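In the short term, companies remain very limited in how far they open up their data, research and algorithms. In the medium to long term, the contest over artificial intelligence will rise to the level of nations, and that is bound to bring closure rather than openness and cooperation.
What is an algorithm?
Understanding algorithms has become a basic skill for people in the 21st century. So what is an algorithm? The infographic below introduces the origins of algorithms and how they affect our lives:
There is also an introduction to Google's PageRank algorithm.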
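For readers who want to see the idea in code, here is a minimal Python sketch of PageRank computed by power iteration. It is an illustration under stated assumptions, not Google's production system: the `pagerank` function, the damping factor of 0.85, the fixed iteration count and the four-page `web` graph are all illustrative choices, and every link target is assumed to appear as a key in the graph.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate PageRank by repeatedly redistributing rank along links.

    `links` maps each page to the list of pages it links to (assumed format).
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start from a uniform distribution

    for _ in range(iterations):
        # Every page receives a small baseline share (the "random jump").
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                # Split this page's rank evenly among the pages it links to.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page with no links spreads its rank over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# A tiny made-up web: A links to B and C, B to C, C to A, D to C.
web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this toy graph page C collects links from three other pages and ends up ranked highest, while D, which nothing links to, keeps only the baseline share.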
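The reality behind so-called machine-assisted drug discovery
Artificial intelligence, with deep learning at its forefront, is bringing profound change to medicine, for example in the discovery of new drugs, as the following passage puts it: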
“Drug companies know they simply cannot be without these computer techniques. They make drug design more rational. How? By helping scientists learn what is necessary, on the molecular level, to cure the body, then enabling them to tailor-make a drug to do the job… This whole approach is helping us avoid the blind alleys before we even step into the lab… Pharmaceutical firms are familiar with those alleys. Out of every 8,000 compounds the companies screen for medicinal use, only one reaches the market. The computer should help lower those odds … This means that chemists will not be tied up for weeks, sometimes months, painstakingly assembling test drugs that a computer could show to have little chance of working. The potential saving to the pharmaceutical industry: millions of dollars and thousands of man-hours”
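This sounds like the manifesto of some Silicon Valley startup, but the passage actually comes from a 1981 magazine article titled "Designing Drugs With Computers". In other words, computer-aided drug discovery has been under way for more than 30 years, and its slow progress is puzzling. In the view of Natacha Dugas, a partner at the Silicon Valley firm Atlas Capital, new technologies such as deep learning and cloud computing bring hope to drug discovery, but the ideal is still a very long way off: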
it’s only one of many contributors to overcoming the challenges of drug discovery today. There remains a wonderful abundance of artful empiricism in the discovery of new drugs, especially around human biology, and this should be embraced