Archives

All posts for the month May, 2021

Training a single AI model can emit as much carbon as five cars in their lifetimes – MIT Tech Review

Posted by timmreardon on 05/29/2021

Posted in: Uncategorized.

Deep learning has a terrible carbon footprint.

by Karen Hao

June 6, 2019

The artificial-intelligence industry is often compared to the oil industry: once mined and refined, data, like oil, can be a highly lucrative commodity. Now it seems the metaphor may extend even further. Like its fossil-fuel counterpart, the process of deep learning has an outsize environmental impact.

In a new paper, researchers at the University of Massachusetts, Amherst, performed a life cycle assessment for training several common large AI models. They found that the process can emit more than 626,000 pounds of carbon dioxide equivalent—nearly five times the lifetime emissions of the average American car (and that includes manufacture of the car itself).

It’s a jarring quantification of something AI researchers have suspected for a long time. “While probably many of us have thought of this in an abstract, vague level, the figures really show the magnitude of the problem,” says Carlos Gómez-Rodríguez, a computer scientist at the University of A Coruña in Spain, who was not involved in the research. “Neither I nor other researchers I’ve discussed them with thought the environmental impact was that substantial.”

The carbon footprint of natural-language processing

The paper specifically examines the model training process for natural-language processing (NLP), the subfield of AI that focuses on teaching machines to handle human language. In the last two years, the NLP community has reached several noteworthy performance milestones in machine translation, sentence completion, and other standard benchmarking tasks. OpenAI’s infamous GPT-2 model, as one example, excelled at writing convincing fake news articles.

But such advances have required training ever larger models on sprawling data sets of sentences scraped from the internet. The approach is computationally expensive—and highly energy intensive.

The researchers looked at four models in the field that have been responsible for the biggest leaps in performance: the Transformer, ELMo, BERT, and GPT-2. They trained each on a single GPU for up to a day to measure its power draw. They then used the number of training hours listed in the model’s original papers to calculate the total energy consumed over the complete training process. That number was converted into pounds of carbon dioxide equivalent based on the average energy mix in the US, which closely matches the energy mix used by Amazon’s AWS, the largest cloud services provider.
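The estimation procedure described above boils down to a short calculation: measure average power draw, scale it by the published training time, and convert the resulting energy into carbon dioxide equivalent. The sketch below is a minimal illustration in Python, not the researchers' code. The function names, the optional `pue` multiplier for data-center overhead, and every numeric input are assumptions chosen for illustration; none of them come from the article.

```python
def training_energy_kwh(avg_power_watts: float, training_hours: float, pue: float = 1.0) -> float:
    """Estimate total training energy in kilowatt-hours.

    avg_power_watts: average power draw measured during a short sample run
                     (assumed here to cover all hardware used for training).
    training_hours:  total training time reported for the model.
    pue:             hypothetical overhead multiplier for cooling and other
                     data-center costs; 1.0 means no overhead is applied.
    """
    return pue * avg_power_watts * training_hours / 1000.0


def co2e_pounds(energy_kwh: float, lbs_co2e_per_kwh: float) -> float:
    """Convert energy into pounds of CO2 equivalent for a given energy mix."""
    return energy_kwh * lbs_co2e_per_kwh


if __name__ == "__main__":
    # Illustrative placeholder inputs only, not figures from the paper.
    energy = training_energy_kwh(avg_power_watts=1500.0, training_hours=80.0, pue=1.5)
    emissions = co2e_pounds(energy, lbs_co2e_per_kwh=1.0)
    print(f"Energy: {energy:.1f} kWh, emissions: {emissions:.1f} lbs CO2e")
```

The emissions factor is kept as an explicit argument because the result depends heavily on the energy mix: the same training run produces a different footprint on a grid with more or less fossil generation.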

They found that the computational and environmental costs of training grew proportionally to model size and then exploded when additional tuning steps were used to increase the model’s final accuracy. In particular, they found that a tuning process known as neural architecture search, which tries to optimize a model by incrementally tweaking a neural network’s design through exhaustive trial and error, had extraordinarily high associated costs for little performance benefit. Without it, the most costly model, BERT, had a carbon footprint of roughly 1,400 pounds of carbon dioxide equivalent, close to a round-trip trans-America flight for one person.

What’s more, the researchers note that the figures should only be considered as baselines. “Training a single model is the minimum amount of work you can do,” says Emma Strubell, a PhD candidate at the University of Massachusetts, Amherst, and the lead author of the paper. In practice, it’s much more likely that AI researchers would develop a new model from scratch or adapt an existing model to a new data set, either of which can require many more rounds of training and tuning.

To get a better handle on what the full development pipeline might look like in terms of carbon footprint, Strubell and her colleagues used a model they’d produced in a previous paper as a case study. They found that the process of building and testing a final paper-worthy model required training 4,789 models over a six-month period. Converted to CO2 equivalent, it emitted more than 78,000 pounds and is likely representative of typical work in the field.
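To put the figures quoted so far side by side, the short Python sketch below divides them against one another. It uses only the rounded numbers reported in the article ("more than 626,000", "roughly 1,400", "more than 78,000", and 4,789 training runs), so the ratios are rough approximations. The variable names are hypothetical, and the per-run average is a simple division that ignores how much individual runs differed in cost.

```python
# Rounded figures as quoted in the article, in pounds of CO2 equivalent.
LARGEST_REPORTED_LBS = 626_000   # most expensive process, with neural architecture search
BERT_LBS = 1_400                 # most costly model without architecture search
PIPELINE_LBS = 78_000            # full development pipeline in the case study
PIPELINE_RUNS = 4_789            # models trained during that pipeline

# How many single BERT training runs each larger figure corresponds to.
largest_vs_bert = LARGEST_REPORTED_LBS / BERT_LBS
pipeline_vs_bert = PIPELINE_LBS / BERT_LBS

# Rough average footprint per training run inside the pipeline.
avg_lbs_per_run = PIPELINE_LBS / PIPELINE_RUNS

if __name__ == "__main__":
    print(f"Largest figure vs. one BERT run:  about {largest_vs_bert:.0f}x")
    print(f"Full pipeline vs. one BERT run:   about {pipeline_vs_bert:.0f}x")
    print(f"Average per pipeline run:         about {avg_lbs_per_run:.1f} lbs")
```

The comparison illustrates Strubell's point that a single training run is only a baseline: the case-study pipeline corresponds to dozens of BERT-sized runs even though most of its individual runs were far cheaper.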

The significance of those figures is colossal—especially when considering the current trends in AI research. “In general, much of the latest research in AI neglects efficiency, as very large neural networks have been found to be useful for a variety of tasks, and companies and institutions that have abundant access to computational resources can leverage this to obtain a competitive advantage,” Gómez-Rodríguez says. “This kind of analysis needed to be done to raise awareness about the resources being spent […] and will spark a debate.”

“What probably many of us did not comprehend is the scale of it until we saw these comparisons,” echoed Siva Reddy, a postdoc at Stanford University who was not involved in the research.

The privatization of AI research

The results underscore another growing problem in AI, too: the sheer intensity of resources now required to produce paper-worthy results has made it increasingly challenging for people working in academia to continue contributing to research.

“This trend toward training huge models on tons of data is not feasible for academics—grad students especially, because we don’t have the computational resources,” says Strubell. “So there’s an issue of equitable access between researchers in academia versus researchers in industry.”

Strubell and her coauthors hope that their colleagues will heed the paper’s findings and help level the playing field by investing in developing more efficient hardware and algorithms.

Reddy agrees. “Human brains can do amazing things with little power consumption,” he says. “The bigger question is how can we build such machines.”

Article link: https://www.technologyreview.com/2019/06/06/239031/training-a-single-ai-model-can-emit-as-much-carbon-as-five-cars-in-their-lifetimes/?


Big Tech’s guide to talking about AI ethics

Posted by timmreardon on 05/16/2021

Posted in: Uncategorized.

After spending a few years cutting through Big Tech’s B.S. about AI ethics, one of our AI writers created a glossary to help you decode what all of their favorite terms actually mean.

#tech #technews #mit #mittechnologyreview #technologyreview #techreview #bigtech #artificialintelligence #siliconvalley #aiethics

AI researchers often say good machine learning is really more art than science. The same could be said for effective public relations. Selecting the right words to strike a positive tone or reframe the conversation about AI is a delicate task: done well, it can strengthen one’s brand image, but done poorly, it can trigger an even greater backlash.

The tech giants would know. Over the last few years, they’ve had to learn this art quickly as they’ve faced increasing public distrust of their actions and intensifying criticism about their AI research and technologies.

Now they’ve developed a new vocabulary to use when they want to assure the public that they care deeply about developing AI responsibly—but want to make sure they don’t invite too much scrutiny. Here’s an insider’s guide to decoding their language and challenging the assumptions and values baked in.

accountability (n) – The act of holding someone else responsible for the consequences when your AI system fails.

accuracy (n) – Technical correctness. The most important measure of success in evaluating an AI model’s performance. See validation.

adversary (n) – A lone engineer capable of disrupting your powerful revenue-generating AI system. See robustness, security.

alignment (n) – The challenge of designing AI systems that do what we tell them to and value what we value. Purposely abstract. Avoid using real examples of harmful unintended consequences. See safety.

artificial general intelligence (phrase) – A hypothetical AI god that’s probably far off in the future but also maybe imminent. Can be really good or really bad, whichever is more rhetorically useful. Obviously you’re building the good one. Which is expensive. Therefore, you need more money. See long-term risks.

audit (n) – A review that you pay someone else to do of your company or AI system so that you appear more transparent without needing to change anything. See impact assessment.

augment (v) – To increase the productivity of white-collar workers. Side effect: automating away blue-collar jobs. Sad but inevitable.

beneficial (adj) – A blanket descriptor for what you are trying to build. Conveniently ill-defined. See value.

by design (ph) – As in “fairness by design” or “accountability by design.” A phrase to signal that you are thinking hard about important things from the beginning.

compliance (n) – The act of following the law. Anything that isn’t illegal goes.

data labelers (ph) – The people who allegedly exist behind Amazon’s Mechanical Turk interface to do data cleaning work for cheap. Unsure who they are. Never met them.

democratize (v) – To scale a technology at all costs. A justification for concentrating resources. See scale.

diversity, equity, and inclusion (ph) – The act of hiring engineers and researchers from marginalized groups so you can parade them around to the public. If they challenge the status quo, fire them.

efficiency (n) – The use of less data, memory, staff, or energy to build an AI system.

ethics board (ph) – A group of advisors without real power, convened to create the appearance that your company is actively listening. Examples: Google’s AI ethics board (canceled), Facebook’s Oversight Board (still standing).

ethics principles (ph) – A set of truisms used to signal your good intentions. Keep it high-level. The vaguer the language, the better. See responsible AI.

explainable (adj) – For describing an AI system that you, the developer, and the user can understand. Much harder to achieve for the people it’s used on. Probably not worth the effort. See interpretable.

fairness (n) – A complicated notion of impartiality used to describe unbiased algorithms. Can be defined in dozens of ways based on your preference.

for good (ph) – As in “AI for good” or “data for good.” An initiative completely tangential to your core business that helps you generate good publicity.

foresight (n) – The ability to peer into the future. Basically impossible: thus, a perfectly reasonable explanation for why you can’t rid your AI system of unintended consequences.

framework (n) – A set of guidelines for making decisions. A good way to appear thoughtful and measured while delaying actual decision-making.

generalizable (adj) – The sign of a good AI model. One that continues to work under changing conditions. See real world.

governance (n) – Bureaucracy.

human-centered design (ph) – A process that involves using “personas” to imagine what an average user might want from your AI system. May involve soliciting feedback from actual users. Only if there’s time. See stakeholders.

human in the loop (ph) – Any person that is part of an AI system. Responsibilities range from faking the system’s capabilities to warding off accusations of automation.

impact assessment (ph) – A review that you do yourself of your company or AI system to show your willingness to consider its downsides without changing anything. See audit.

interpretable (adj) – Description of an AI system whose computation you, the developer, can follow step by step to understand how it arrived at its answer. Actually probably just linear regression. AI sounds better.

integrity (n) – Issues that undermine the technical performance of your model or your company’s ability to scale. Not to be confused with issues that are bad for society. Not to be confused with honesty.

interdisciplinary (adj) – Term used of any team or project involving people who do not code: user researchers, product managers, moral philosophers. Especially moral philosophers.

long-term risks (n) – Bad things that could have catastrophic effects in the far-off future. Probably will never happen, but more important to study and avoid than the immediate harms of existing AI systems.

partners (n) – Other elite groups who share your worldview and can work with you to maintain the status quo. See stakeholders.

privacy trade-off (ph) – The noble sacrifice of individual control over personal information for group benefits like AI-driven health-care advancements, which also happen to be highly profitable.

progress (n) – Scientific and technological advancement. An inherent good.

real world (ph) – The opposite of the simulated world. A dynamic physical environment filled with unexpected surprises that AI models are trained to survive. Not to be confused with humans and society.

regulation (n) – What you call for to shift the responsibility for mitigating harmful AI onto policymakers. Not to be confused with policies that would hinder your growth.

responsible AI (n) – A moniker for any work at your company that could be construed by the public as a sincere effort to mitigate the harms of your AI systems.

robustness (n) – The ability of an AI model to function consistently and accurately under nefarious attempts to feed it corrupted data.

safety (n) – The challenge of building AI systems that don’t go rogue from the designer’s intentions. Not to be confused with building AI systems that don’t fail. See alignment.

scale (n) – The de facto end state that any good AI system should strive to achieve.

security (n) – The act of protecting valuable or sensitive data and AI models from being breached by bad actors. See adversary.

stakeholders (n) – Shareholders, regulators, users. The people in power you want to keep happy.

transparency (n) – Revealing your data and code. Bad for proprietary and sensitive information. Thus really hard; quite frankly, even impossible. Not to be confused with clear communication about how your system actually works.

trustworthy (adj) – An assessment of an AI system that can be manufactured with enough coordinated publicity.

universal basic income (ph) – The idea that paying everyone a fixed salary will solve the massive economic upheaval caused when automation leads to widespread job loss. Popularized by 2020 presidential candidate Andrew Yang. See wealth redistribution.

validation (n) – The process of testing an AI model on data other than the data it was trained on, to check that it is still accurate.

value (n) – An intangible benefit rendered to your users that makes you a lot of money.

values (n) – You have them. Remind people.

wealth redistribution (ph) – A useful idea to dangle around when people scrutinize you for using way too many resources and making way too much money. How would wealth redistribution work? Universal basic income, of course. Also not something you could figure out yourself. Would require regulation. See regulation.

withhold publication (ph) – The benevolent act of choosing not to open-source your code because it could fall into the hands of a bad actor. Better to limit access to partners who can afford it.

Article link:
https://www.technologyreview.com/2021/04/13/1022568/big-tech-ai-ethics-guide/

How Xiaomi Became an Internet-of-Things Powerhouse – HBR

Posted by timmreardon on 05/06/2021

Posted in: Uncategorized.

When Xiaomi entered the fiercely competitive smartphone market in 2010, it did so without even offering a real phone. The company only offered a free Android-based operating system (OS). Yet, within seven years, Xiaomi became one of the world’s largest smartphone makers, reaching $15 billion in revenue. Accelerating its growth rate, Xiaomi transformed into the world’s largest consumer IoT (Internet of Things) firm by 2020, with its revenue surpassing $37 billion and more than 210 million IoT devices (excluding smartphones and laptops) sold across more than 90 countries. How was Xiaomi able to grow so explosively and what lessons can other companies learn from Xiaomi’s rise?

We sought answers via an in-depth, multi-year study of the firm, including extensive interviews with 12 top executives (including cofounders, chairman, CEO, president, senior VPs, and executives leading R&D, distribution, and marketing), as well as the founder and CEO of Smartmi, Xiaomi’s largest ecosystem partner. Our research also involved analyzing more than 100 hours of conversations and reviewing more than 5,000 Xiaomi documents (from 2010–2020) as well as 470 external reports and data sets.

We learned that the secret to Xiaomi’s growth lies in what we term “strategic coalescence.” The word “coalesce” originates from the Latin words co (“together”) and alescere (“to grow”). Strategic coalescence thus refers to a process through which a firm intimately connects with demand and supply-side stakeholders, bolsters tangible benefits for all, and triggers exponential market growth. Let’s first understand the key aspects of strategic coalescence at Xiaomi.

Coalescence with consumers

Xiaomi entered its first market — China — by offering a smartphone OS, called MIUI, for free. At the time, there were several strong domestic (e.g., Huawei, Lenovo) and international players (e.g., Apple, Samsung) battling over every tier of the market, from economical to premium. Most Chinese manufacturers simply smacked the Chinese version of Android on their smartphones, with little customization.

Instead of competing head on, Xiaomi courted tech-savvy smartphone users by offering them free software and building a fully-fledged online community to engage with them and understand which features they craved and which they disliked. This segment of consumers loved the unprecedented attention from a tech firm and were highly motivated to interact and contribute suggestions.

Xiaomi released a new OS version for download every Friday afternoon, as its tech-savvy consumers were heading home for the weekend. Its engineers followed up on user suggestions as soon as they were received, often corresponding with users to resolve issues together. This co-development process enhanced Xiaomi’s brand awareness and likability and prepared a segment of potential consumers for the entry of Xiaomi’s phones, without spending money on traditional advertising.

When it introduced its first phone in August 2011, Xiaomi positioned itself as offering “quality technology at an affordable price.” It sold directly to consumers, through its own website, at a margin of below 5% — the thinnest margin in the industry. Because of its direct engagement with tech-savvy consumers, Xiaomi was able to trim out all intermediaries — the many tiers of national, regional and local wholesalers and retailers, each of which charged a markup. Its direct-to-consumer approach created a significant cost advantage — the phone’s feature to price ratio was far more favorable than anything else on the market — and increased the speed at which Xiaomi could reach its consumers. Target consumers responded: Demand outpaced production so much that the firm could only open its e-commerce site one day per week and stocks sometimes sold out within minutes. The constant and instant sell-outs led to social media storms, spreading the brand to an ever-wider audience, stimulating further demand.

Coalescing operations around the core value proposition

After gaining a foothold in the tech-savvy, value-conscious segment in the top cities, Xiaomi began to expand into other segments — consumers who were less tech savvy, as well as those residing in smaller cities. Many of these consumers preferred an offline shopping experience, wanting to discuss their needs with a staff member or get a demonstration.

To serve these new customers, Xiaomi built an offline retail infrastructure, setting up hundreds of stores spanning major metros and small cities. Unlike other smartphone makers, who co-located their stores at the “telecom street” (an area dedicated to telecom stores), Xiaomi set up its stores in locations with high foot traffic, like malls, where its new target consumers were likely to shop. Importantly, Xiaomi chose malls where existing “high value at a reasonable price” anchor stores could help reinforce its own positioning. It also started offering different sub-brands (Redmi as an economical product line and Mi MIX for more advanced tech seekers), always ensuring that the features-to-price ratio of each new phone was more appealing than competing products.

In sum, during its initial phase, Xiaomi focused on quickly building a large smartphone consumer base across value-seeking consumer segments along with an appropriate on- and offline distribution infrastructure, always keeping its promised low margin on hardware. This enabled it to achieve massive volumes. Xiaomi expanded the share of wallet of this huge and growing customer base, with higher margin post-purchase services (commissions on music, videos, or game purchases), to help attain profitability. These laid the foundation for Xiaomi’s subsequent IoT endeavors.

Leveraging coalescing synergies

Xiaomi’s expansion into the IoT sphere was further empowered by four coalescing synergies.

In-Home IoT Synergy

Xiaomi leveraged its smartphones as an “omni-remote control” and began launching products that could be linked to and controlled by its phones (such as TVs, air conditioners, air purifiers, and smart lamps). In addition to developing its own products, Xiaomi sought partners who could help the firm quickly expand the range of its IoT offerings. The products from the partners were easily integrated into Xiaomi’s in-home system as they were built on its IoT protocol. This meant that once consumers acquired their first Xiaomi IoT product, they were more likely to seek out other products from Xiaomi. In other words, it became progressively harder for competitors to lure customers away in an IoT category.

Design Aesthetics Synergy

To further strengthen the bond between Xiaomi and its customers, the firm ensured that all Xiaomi-branded IoT products, including those manufactured by ecosystem partners, followed similar design aesthetics. So, if a consumer purchased another Xiaomi product, that item would be more aesthetically congruent with the Xiaomi products they already owned, creating synergy through design congruency and visual gestalt.

Product Portfolio Synergy

A key challenge associated with offline distribution is the high and ever-increasing square footage cost, especially at prime locations. Non-smartphone products (including those by partner firms) could yield much higher margins than smartphones, making the opening and running of offline stores more financially viable. Also, selling a variety of products in the stores attracted consumers who were not specifically looking for smartphones, creating opportunities to promote Xiaomi smartphones and, more generally, to cross-sell its entire portfolio. Furthermore, a broader product portfolio encompassing items with shorter replacement cycles (such as fitness bands and smart light bulbs) created higher foot traffic, leading to additional unplanned purchases and cross-selling opportunities in store.

Multi-Channel Synergy

To maximize returns on its brick-and-mortar stores, Xiaomi leveraged online sales data, using analytics to inform which products to sell offline and how to optimize the product mix at the store level. Offline stores were leveraged to offer potential consumers demonstrations for more experiential products (such as vacuum cleaner robots or AI speakers), moving those potential customers along the decision process whereby the demonstration could either seal an immediate purchase or nudge the consumer towards a later online purchase. The latter provided an additional multichannel synergy, from offline to online.

These four synergies coalesced together, amplifying each other’s effect. Consequently, Xiaomi was able to attract a growing number of potential customers to visit its stores (as opposed to smartphone competitors’ stores) and enhance the likelihood that they make purchases within Xiaomi’s ecosystem. This propelled rapid adoption of Xiaomi’s IoT products, with customers frequently visiting Xiaomi stores and purchasing multiple items.

Coalescence with partners

To effectively expand into categories outside Xiaomi’s expertise and bolster the four synergies, Xiaomi implemented a unique process for identifying and developing partnerships. These yielded some advantages:

Partners were hand-picked by Xiaomi cofounders and top executives through their personal networks. Because of the close personal connections, executives at Xiaomi had in-depth knowledge about each partner. They understood the capabilities and values of the management team, enabling Xiaomi to better assess the likelihood of collaboration success.
Leveraging personal networks meant that Xiaomi executives were also well connected to the social network of each partner. If some partners performed poorly or violated the partnership agreement, there would be an immediate and direct reputational cost to them, making it harder for them to leverage their social network for further business endeavors, a crucial success factor, particularly in the Chinese business context. This social cost complemented the economic incentive for partnering, bolstering the likelihood of successful collaboration. Certainly, this cherry-picking approach also had drawbacks — it limited the number of potential partners from which Xiaomi could select. However, in the case of Xiaomi’s IoT transformation, its executives believed the pros outweighed the cons.
Xiaomi invested in the partner firms but did not acquire controlling shares. While the investment incurred a risk, it created significant benefits. The “co-owner” relationship facilitated communication and increased trust. Xiaomi was able to gain access to information about each partner’s cost structure and operations, as well as participate in their business decisions. Because partners retained majority shares, they were motivated to develop and sell successful products. As a shareholder, Xiaomi benefited from the growth of its partner firms and the profits they made. Simply put, this form of co-ownership created a win-win outcome for both Xiaomi and its partners.
Xiaomi purposefully selected firms that were small or startups, so that partnering with Xiaomi offered significant value to them. These firms typically focused on a single category of products, and this specialization meant higher likelihood of producing great products. One important benefit Xiaomi offered to its partners was “incubation”: It assisted them with R&D by sending in teams of its own engineers, and it helped its partners identify key suppliers and negotiate contracts. Xiaomi’s investment and operational involvement brought brand awareness and prestige, making suppliers more willing to offer favorable terms to partner firms (compared to a “nobody” startup). Importantly, by ensuring that partner firms had access to solid designs and used quality inputs at reasonable cost, Xiaomi safeguarded the quality and price attractiveness of their final products.

These approaches enabled Xiaomi to effectively manage the partner network and offer an ever-growing portfolio of products consistent with the Xiaomi brand in design, aesthetics, quality, and technology/price ratio. Xiaomi’s coalescence with partners laid another foundation for the firm to become a global IoT giant.

Xiaomi’s growth path differs from conventional strategic thinking. While we are often taught that a firm’s strategy should be based on either cost leadership or differentiation and must serve either a few needs of a broad segment or broad needs of a narrow segment, Xiaomi is clearly an outlier. It differentiated on multiple frontiers and at the same time attained cost leadership. It achieved these through strategic coalescence — by coalescing with consumers and partners, which erected and continuously fortified barriers to entry on both the demand and supply sides. This resulting sustainable competitive advantage catapulted Xiaomi forward at warp speed.

Haiyang Yang is an associate professor at the Johns Hopkins Carey Business School, Johns Hopkins University. His research focuses on decision-making. His work has appeared in premier journals such as the Journal of Marketing Research, Journal of Consumer Research, Journal of Consumer Psychology, and Psychological Science.

Jingjing Ma is an assistant professor at the National School of Development, Peking University. Her research focuses on marketing. Her work has appeared in premier journals such as the Journal of Marketing Research, Journal of Consumer Research, and Journal of Consumer Psychology.

Article link: How Xiaomi Became an Internet-of-Things Powerhouse (ampproject.org)

Amitava Chattopadhyay is the GlaxoSmithKline Chaired Professor of Corporate Innovation at INSEAD. He is co-author of The New Emerging Market Multinationals: Four Strategies for Disrupting Markets and Building Brands and has published extensively in premier journals such as the Journal of Marketing, Journal of Marketing Research, Journal of Consumer Research, Marketing Science, and Management Science. You can follow him on Twitter @AmitavaChats.

healthcarereimagined

Envisioning healthcare for the 21st century
