Build intelligent apps with machine learning and large language models.
AI often starts with data: numbers, lists, categories. A simple 'pattern' might be: the average of a list, or the most common item. Here we work with a list of numbers and find the average, a tiny step toward what ML does with lots of data.
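A minimal sketch of finding the average of a list. The list `scores` and the function name `average` are made up for illustration and are not part of the lesson.

```python
# Hypothetical sample data, invented for this example.
scores = [70, 85, 90, 65, 80]

def average(numbers):
    """Return the sum of the numbers divided by how many there are."""
    return sum(numbers) / len(numbers)

print(average(scores))  # 78.0
```

The function assumes the list is not empty; an empty list would cause a division by zero.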
A very simple 'model' is a rule we write by hand. For example: if score >= 80 then predict 'pass'. Machine learning later learns such rules from data. Here we write a small rule and use it on a few examples.
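The hand-written rule above could look like this. The function name `predict`, the 'fail' label, and the example scores are assumptions added for illustration.

```python
def predict(score):
    """Hand-written rule: 80 or more is 'pass', anything else is 'fail'."""
    if score >= 80:
        return "pass"
    return "fail"  # 'fail' is an assumed label for the other case

# Hypothetical example scores.
examples = [95, 80, 79, 42]
predictions = [predict(s) for s in examples]
print(predictions)  # ['pass', 'pass', 'fail', 'fail']
```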
AI programs often loop over lots of data: for item in data: ... We might count, sum, or check a condition. Here we loop over a list and collect a result—like building a simple 'dataset' of answers.
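One way to sketch such a loop: it sums the items, counts how many meet a condition, and collects an answer for each item. The data and the condition `item > 6` are invented for this example.

```python
# Hypothetical data, invented for this example.
data = [3, 12, 7, 20, 5]

total = 0        # running sum
big_count = 0    # how many items pass the check
answers = []     # a tiny 'dataset' of (item, answer) pairs

for item in data:
    total += item
    is_big = item > 6  # assumed condition, chosen only for illustration
    if is_big:
        big_count += 1
    answers.append((item, is_big))

print(total)      # 47
print(big_count)  # 3
print(answers)
```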
Mean is the average (sum / count). Median is the middle value when sorted. Both summarize a list of numbers. AI often uses these to understand data.
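A small sketch of both summaries. The function names and the sample list are assumptions. For a list with an even number of items there is no single middle value, so this sketch follows the usual convention of averaging the two middle values.

```python
def mean(numbers):
    """Sum divided by count."""
    return sum(numbers) / len(numbers)

def median(numbers):
    """Middle value of the sorted list (average of the two middle values if the count is even)."""
    ordered = sorted(numbers)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

# Hypothetical data with one very large value.
values = [4, 1, 9, 2, 100]
print(mean(values))    # 23.2
print(median(values))  # 4
```

Notice how the single large value pulls the mean up, while the median stays close to the typical values.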
Group data by category and count. Example: count how many fruits are 'apple', 'banana', etc. This is like building a simple histogram—useful before training a classifier.
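Counting by category can be sketched with a dictionary. The list `fruits` is made-up sample data.

```python
# Hypothetical sample data.
fruits = ["apple", "banana", "apple", "cherry", "banana", "apple"]

counts = {}
for fruit in fruits:
    # Start at 0 the first time a category appears, then add 1.
    counts[fruit] = counts.get(fruit, 0) + 1

print(counts)  # {'apple': 3, 'banana': 2, 'cherry': 1}
```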
Many simple AI rules use a threshold: if score > 70 then 'pass'. You can combine several rules. This is the idea behind decision rules and decision trees.
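Combining several threshold rules might look like the sketch below. The second threshold (90) and the labels 'distinction' and 'fail' are assumptions added for illustration; only the `score > 70` rule comes from the text.

```python
def grade(score):
    """Check the thresholds from highest to lowest, like a tiny decision tree."""
    if score > 90:   # assumed extra threshold
        return "distinction"
    if score > 70:   # threshold from the text
        return "pass"
    return "fail"    # assumed label for everything else

print(grade(95))  # distinction
print(grade(71))  # pass
print(grade(70))  # fail
```

The order of the checks matters: testing `score > 70` first would label a score of 95 as 'pass'.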
Data for AI often comes as a list of records (dicts): each item has the same keys (e.g. age, score, result). You can loop and filter or aggregate by key.
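A sketch of filtering and aggregating a list of records. The records themselves are invented; the keys follow the example keys in the text.

```python
# Hypothetical records; every dict has the same keys.
records = [
    {"age": 15, "score": 82, "result": "pass"},
    {"age": 16, "score": 64, "result": "fail"},
    {"age": 15, "score": 91, "result": "pass"},
]

# Filter: keep only the records whose result is 'pass'.
passed = [r for r in records if r["result"] == "pass"]

# Aggregate: average score of the filtered records.
avg_pass_score = sum(r["score"] for r in passed) / len(passed)

print(len(passed))      # 2
print(avg_pass_score)   # 86.5
```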
Sometimes we scale numbers to a range (e.g. 0–1). One way: (x - min) / (max - min). This can help when comparing different features in data.
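The formula above can be sketched as a small function. The name `min_max_scale` is an assumption, and so is the choice to return all zeros when every value is the same (the formula would otherwise divide by zero).

```python
def min_max_scale(values):
    """Scale each value to the range 0 to 1 using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumed handling: all values equal, so there is no range to scale over.
        return [0.0 for _ in values]
    return [(x - lo) / (hi - lo) for x in values]

print(min_max_scale([10, 20, 30, 50]))  # [0.0, 0.25, 0.5, 1.0]
```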
A simple way to 'combine' several answers is majority vote: count each label and pick the one that appears most. This is the idea behind simple ensembles, where several models each give an answer.
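A minimal majority-vote sketch using `collections.Counter` from the standard library. The function name and the labels are made up for illustration.

```python
from collections import Counter

def majority_vote(labels):
    """Return the label that appears most often."""
    return Counter(labels).most_common(1)[0][0]

# Hypothetical answers from five different 'voters'.
votes = ["cat", "dog", "cat", "cat", "dog"]
print(majority_vote(votes))  # cat
```

If two labels are tied, `most_common` returns the one that was seen first, so a real system may want an explicit tie-breaking rule.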
In 2D, the distance between (x1,y1) and (x2,y2) is sqrt((x2-x1)**2 + (y2-y1)**2). Similar points have small distance. This idea is used in nearest-neighbour and clustering.
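The distance formula can be sketched as below, followed by a tiny nearest-neighbour lookup. The points and the query are invented for this example.

```python
import math

def distance(p, q):
    """Straight-line distance between two 2D points."""
    x1, y1 = p
    x2, y2 = q
    return math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2)

print(distance((0, 0), (3, 4)))  # 5.0

# Hypothetical points: find the one closest to the query.
points = [(1, 1), (5, 5), (2, 3)]
query = (2, 2)
nearest = min(points, key=lambda p: distance(query, p))
print(nearest)  # (2, 3)
```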
In ML we often split data: use part to 'train' (learn) and part to 'test' (check how well it works on new data). Here we simulate: take 80% of a list as train, 20% as test.
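A sketch of the 80/20 split using list slicing. The data is just the numbers 1 to 10, chosen for illustration. Real projects usually shuffle the data before splitting; that step is left out here to keep the result predictable.

```python
# Hypothetical data: the numbers 1 to 10.
data = list(range(1, 11))

# Index where the first 80% ends.
split_index = int(len(data) * 0.8)

train = data[:split_index]  # first 80%
test = data[split_index:]   # remaining 20%

print(train)  # [1, 2, 3, 4, 5, 6, 7, 8]
print(test)   # [9, 10]
```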
Features are the inputs we use to make a prediction (e.g. age, score, colour). We often store them as numbers or categories. Good features carry information about the answer and help the model; irrelevant ones add noise and can hurt it.
A very simple rule (e.g. always predict 'yes') might miss patterns (bias). A very complex rule might fit noise (variance). In practice we try to find a balance.
Accuracy = correct predictions / total predictions. After we have predictions and true answers, we count how many match and divide by total. It's the simplest metric for classification.
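Accuracy can be sketched in a few lines. The function name and the two lists are assumptions, and the function expects both lists to have the same length.

```python
def accuracy(predictions, truths):
    """Correct predictions divided by total predictions."""
    correct = sum(1 for p, t in zip(predictions, truths) if p == t)
    return correct / len(truths)

# Hypothetical predictions and true answers.
predicted = ["pass", "fail", "pass", "pass"]
actual = ["pass", "fail", "fail", "pass"]
print(accuracy(predicted, actual))  # 0.75
```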
LLMs take text (a prompt) and generate more text. The prompt tells the model what to do. In code we might call an API with a prompt. Here we simulate: a function that 'responds' based on keywords.
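A toy simulation of a prompt-and-response function. This is not an LLM and does not call any API; the keywords and replies are all invented for illustration.

```python
def respond(prompt):
    """Pretend 'model': pick a reply based on keywords in the prompt."""
    text = prompt.lower()
    if "hello" in text:
        return "Hi there!"
    if "weather" in text:
        return "I can't check the weather, but I hope it's sunny."
    return "Tell me more."  # fallback when no keyword matches

print(respond("Hello, bot"))
print(respond("What is the weather like?"))
print(respond("Explain tokens"))
```

A real LLM generates text from patterns learned during training rather than from fixed keyword checks, but the shape is the same: text in, text out.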
AI can reflect biases in data. If training data is unfair, the model might be unfair too. We should think about who is affected and whether the system is fair. Checking data and results helps.
Real AI often has a pipeline: load data → clean (fix missing, errors) → transform (features) → train → evaluate. Here we do a tiny pipeline: load a list, filter, then compute a stat.
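The tiny pipeline described above might be sketched like this. The raw values and the cleaning rule (drop missing and negative entries) are assumptions made for the example.

```python
# Step 1: load. Hypothetical raw data with missing (None) and invalid (-1) entries.
raw = [12, None, 7, -1, 30, None, 18]

# Step 2: filter. Keep only values that are present and not negative.
clean = [x for x in raw if x is not None and x >= 0]

# Step 3: compute a stat. Here, the average of the cleaned values.
result = sum(clean) / len(clean)

print(clean)   # [12, 7, 30, 18]
print(result)  # 16.75
```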
LLMs see text as tokens (pieces: words or subwords). Longer text = more tokens. We can simulate by splitting a string into words and counting.
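A rough simulation of token counting by splitting on whitespace. The function name `count_tokens` is an assumption, and real tokenizers split text into subwords, so their counts will usually differ from a simple word count.

```python
def count_tokens(text):
    """Approximate token count: split on whitespace and count the pieces."""
    return len(text.split())

print(count_tokens("AI models read text as tokens"))  # 6
print(count_tokens(""))                               # 0
```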
An embedding turns text (or other data) into a list of numbers so that similar things have similar numbers. We don't build one here; the key idea is the same as 'encode as a vector', so that items can be compared by how close their vectors are.
You've seen: data and patterns, simple rules, loops over data, mean/median, counting by category, thresholds, lists of records, scaling to a range, majority vote, distance, train/test split, features, bias and variance, accuracy, prompts, fairness, pipelines, tokens, embeddings. These are building blocks for real ML and LLMs.