data science
AI Framework New AI business idea with framework Presentation design is pretty personal and cultural. It depends on your design preferences, the complexity of the message, and the organizational culture. Even with exactly the same message, the bes…
What is "prompt engineering"? Prompt engineering is a relatively new discipline for developing and optimizing prompts to efficiently use language models (LMs) for a wide variety of applications and research topics. Prompt engineering skill…
The ChatGPT era has revolutionized artificial intelligence and language processing, but with these advancements comes a new set of challenges. One of the most pressing concerns is the future shortage of training data. In this article, we'l…
www.youtube.com Formula Problem statement If from <input> we could know <output>, then <someone> could derive a benefit by Inputs(X) What are the inputs X? What does X fail to encode? What problems might this create? Outputs(Y) How is Y constructed? - Automation …
A framework for understanding sources of harm throughout the Machine Learning Life Cycle As I have mentioned several times, I am really interested in data quality and its impact on AI. www.youtube.com Academic article: A Framework for Understa…
Would AI be a true threat to humans? If you are familiar with sci-fi, AI is definitely a threatening technology that could destroy humanity, as Skynet did in Terminator 2. But will it really happen? waitbutwhy.com The point is th…
Summary of ChatGPT Everyone is going crazy about ChatGPT. There are plenty of articles explaining how innovative it is and why it is disruptive. I just summarized my thoughts about ChatGPT technology in the NLP context and picked up some insightf…
ChatGPT has created a huge phenomenon and shown the potential impact of NLP models. However, many people struggle to understand what differentiates it from past models and what will happen next. This article is a good summary of ChatGPT if you are no…
Question Answering Tasks in Machine Learning Question answering is one of the major problem sets in NLP. In a nutshell, it is a simpler version of the Reading Comprehension problems in the GMAT or similar tests. There are multiple types of questions and texts,…
What is overfitting? Overfitting occurs when the model has high variance, i.e., the model performs well on the training data but does not perform accurately on the evaluation set. What is Overfitting in Deep Learning [+10 Ways to Avoid I…
Tokenization What is Tokenization | Methods to Perform Tokenization Tokenization is essentially splitting a phrase, sentence, paragraph, or an entire text document into smaller units, such as individual words or terms. Each of these small…
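To make the idea of splitting text into smaller units concrete, here is a minimal sketch in Python. The function names `word_tokenize` and `sentence_tokenize` are hypothetical helpers written for this illustration (they are not taken from the linked article or from any library), and the regular expressions are deliberately naive: they assume English-like text and do not handle abbreviations or contractions well.

```python
import re


def word_tokenize(text: str) -> list[str]:
    """Split text into word tokens and single punctuation tokens.

    Hypothetical helper for illustration only.
    """
    # \w+ matches a run of letters, digits, or underscores;
    # [^\w\s] matches one character that is neither a word character nor whitespace.
    return re.findall(r"\w+|[^\w\s]", text)


def sentence_tokenize(text: str) -> list[str]:
    """Split text into sentences at whitespace that follows '.', '!' or '?'.

    Hypothetical helper for illustration only; abbreviations such as
    "e.g." would be split incorrectly.
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]
```

For example, `word_tokenize("Hello, world!")` yields the four tokens `Hello`, `,`, `world`, and `!`. Real projects usually rely on a library tokenizer instead, because rules like these break quickly on messy text.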
NLP = playing with unstructured string data String data is super messy. Before running ML or advanced analytics, we need to cleanse the data. Encoding is another tricky issue in text data analysis. Remember emoji. We need to understand ASCII …
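The encoding point can be illustrated with a small Python sketch. It only uses built-in string methods; the helper names `utf8_length` and `strip_non_ascii` are hypothetical and exist purely for this example. Dropping non-ASCII characters is shown as one possible cleansing step, not as a recommendation, since it silently discards information such as emoji.

```python
def utf8_length(text: str) -> int:
    """Return how many bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def strip_non_ascii(text: str) -> str:
    """Drop every character that cannot be represented in ASCII.

    Hypothetical cleansing helper for illustration only.
    """
    # errors="ignore" silently discards characters outside the ASCII range.
    return text.encode("ascii", errors="ignore").decode("ascii")


emoji = "\U0001F44D"  # thumbs-up emoji, a single Unicode code point

# One character for Python, but four bytes in UTF-8 and not ASCII at all.
print(len(emoji), utf8_length(emoji), emoji.isascii())
print(repr(strip_non_ascii("Nice " + emoji)))
```

The gap between "one character" and "four bytes" is the kind of detail that tends to cause trouble when string lengths, database column sizes, or legacy ASCII-only tools are involved.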
The confusion matrix is super confusing. https://www.researchgate.net/figure/Confusion-matrix-and-performance-equations-The-confusion-matrix-included-four_fig1_340034692 Every time, I forget the definitions of TP, FP, etc., and of the metrics as well…
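As a memory aid for TP, FP, FN, and TN, here is a minimal sketch in plain Python for binary labels. The function names and the toy labels are made up for this example, and it assumes that `1` marks the positive class and `0` the negative class. The metrics use the standard definitions: precision = TP / (TP + FP), recall = TP / (TP + FN), accuracy = (TP + TN) / total.

```python
def confusion_counts(y_true: list[int], y_pred: list[int]) -> dict[str, int]:
    """Count TP, FP, FN, TN for binary labels (1 = positive, 0 = negative)."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for actual, predicted in zip(y_true, y_pred):
        if predicted == 1 and actual == 1:
            counts["TP"] += 1  # predicted positive, actually positive
        elif predicted == 1 and actual == 0:
            counts["FP"] += 1  # predicted positive, actually negative
        elif predicted == 0 and actual == 1:
            counts["FN"] += 1  # predicted negative, actually positive
        else:
            counts["TN"] += 1  # predicted negative, actually negative
    return counts


def metrics(counts: dict[str, int]) -> dict[str, float]:
    """Derive precision, recall, accuracy, and F1 from the four counts."""
    tp, fp, fn, tn = counts["TP"], counts["FP"], counts["FN"], counts["TN"]
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn) if (tp + fp + fn + tn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1": f1}


# Toy labels, invented for this example.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
counts = confusion_counts(y_true, y_pred)
scores = metrics(counts)
```

With these toy labels the counts are TP = 3, FP = 1, FN = 1, TN = 3, so precision, recall, accuracy, and F1 all come out to 0.75.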
Categories of ML Supervised vs Unsupervised Model based vs instance based Online vs Offline Supervised vs Unsupervised Supervised ML Learning a mapping of inputs to outputs. We have examples of both and we find algorithms that can learn this…
Data, Models, and Decisions is one of the typical MIT courses. We learn a lot of statistical analysis and optimization through this course. We had our first team assignment, and it was awesome. We work for the restaurant chains an…
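In the West, the term "Human-centric" comes up often when people talk about AI. One example is Stanford University's HAI: Home | Stanford HAI. In Japanese this would be "people-centered AI", but I think several different contexts have emerged around it, so I would like to introduce them here. Skynet, which middle-aged guys love so much …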
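www.adexchanger.com The recent trend around IDs in digital advertising is clearly in decline. Apple and Google are both moving toward no longer providing any trackable mobile or browser IDs to third parties. This is not only about targeting, but also about digital advertising's advant…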
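On Sunday, June 20, 2021, under a hot, cloudy sky, I took the Personal Information Protection Specialist (個人情報保護士) exam in Yokohama. Motivation To be honest, as a data scientist I really hated being talked down to by corporate guys who know a little about personal information, but let me unpack it a bit more. ・…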
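I read デジタルテクノロジーと国際政治の力学 (Digital Technology and the Dynamics of International Politics) デジタルテクノロジーと国際政治の力学 (NewsPicks Publishing) Author: 塩野誠 Release date: 2020/10/07 Format: Kindle edition Today, technological progress is no longer only a matter of business and economics; it is also a matter of politics, …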
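I realized that quite a lot of time has passed since people started saying that AI would replace human jobs. Probably one of the earliest places this was stated is the Nomura Research Institute report below. https://www.nri.com/-/media/Corporate/jp/Files/PDF/news/newsrelease/cc/2015/151202_1.pdf In it, …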
The Actual Difference Between Statistics and Machine Learning I dug up an old piece from Medium. If you can read English, please read the original article. “The major difference between machine learning and statistics is their purpose. Machine learning models…
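I went with a slightly provocative title, but it is not far from what I want to say. People are not thinking simply enough about what fundamentally underpins the strength of the FAANG companies (excluding Apple). Massive cash and investment? That is something which, through a strong business and an IPO, later…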
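My current role I currently work as a machine learning engineer at a digital agency in London, developing products that use cloud computing. The end users are trading desk teams spread around the globe. If it is designed correctly, all at once…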
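I don't usually write much about work, but at work I spend most of my time tinkering with Google Cloud. Until a little while ago, I was mostly fiddling with web-log data, working on time-series forecasting and the usual things like expected CV values, using Scikit-learn and Keras to grind out mod…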