February 22, 2024
Reference # https://angeloyeo.github.io/2020/09/16/rejection_sampling.html
https://www.youtube.com/watch?v=yCv2N7wGDCw
gpt conversation: https://chat.openai.com/share/c65f4fe6-3274-4267-a67e-41ba780ef198
Likelihood # Suppose you have data, and you are speculating what parameters have generated the data.
For instance, when someone tests positive (data) in a doping test, you wonder whether he actually doped (parameter: the probability of doping, or whether he doped or not).
The likelihood is \( P(\text{test positive} \mid \text{dope}) \) or \( P(\text{test positive} \mid \text{not dope}) \)
...
February 19, 2024
This is a reinterpretation of the book The Art of Statistics: Learning from Data.
What is data # An individual has no interpretation in itself in statistics.
It’s the relationship to the whole (many individuals) which gives the meaning of an individual.
This contrasts with another perspective, in which each individual has an inherent meaning:
natural rights
properties of humans (homo sapiens, homo ludens, ...)
And statistics deals with unknowns. If we know everything about something, if we know it deterministically, statistics doesn’t deal with it.
...
December 8, 2023
Statistical learning (part 3) # These are my notes from the course (lectures 6.10 - 7.4) at https://learning.edx.org/course/course-v1:StanfordOnline+STATSX0001+2T2023/home
PCA # Find a direction where data varies the most. What does it mean?
Suppose you have such a direction. How do you verify that it is the wanted direction? I.e., what is our goal?
Suppose we have a direction; rotate the axes so that the direction is horizontal.
Now our data points are plotted in the new coordinates.
...
November 29, 2023
Information theory # These are my notes from reading the book Elements of Information Theory.
Entropy # The measure of uncertainty of a probability distribution.
\[ H = E\left[\log \frac{1}{p(x)}\right] = -\sum_x p(x) \log p(x) \]
If you have a 2-sided coin, your uncertainty is \( -\frac{1}{2} \log_2(\frac{1}{2}) \times 2 = 1 \)
If you have a 4-sided coin, your uncertainty is \( -\frac{1}{4} \log_2(\frac{1}{4}) \times 4 = 2 \)
Your uncertainty is measured by how many bits (the base of the logarithm can be changed) you need, on average, to encode an outcome drawn from the probability distribution.
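The two coin examples can be checked numerically. This is a minimal sketch, assuming base-2 logarithms (bits); the function name `entropy_bits` and the biased coin are illustrative additions, not from the book.

```python
import math


def entropy_bits(probs):
    """Shannon entropy in bits: H = -sum(p * log2(p)).

    Outcomes with probability 0 are skipped, since they contribute nothing.
    """
    return -sum(p * math.log2(p) for p in probs if p > 0)


fair_coin = [0.5, 0.5]
four_sided = [0.25] * 4
biased_coin = [0.9, 0.1]  # hypothetical example, not from the notes

print(entropy_bits(fair_coin))    # 1 bit for the 2-sided coin
print(entropy_bits(four_sided))   # 2 bits for the 4-sided coin
print(entropy_bits(biased_coin))  # less than 1 bit: the outcome is more predictable
```

The biased coin illustrates why entropy is an average: a nearly certain outcome needs fewer bits on average than a fair one.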
...
November 28, 2023
Matrix methods # These are my notes on https://ocw.mit.edu/courses/18-065-matrix-methods-in-data-analysis-signal-processing-and-machine-learning-spring-2018/pages/readings/
lectures 12-21
These are practical applications, but they are hard to remember without actually applying them somewhere.
Randomized Matrix multiplication # When matrices are huge, multiplying them is prohibitively expensive, so we sample a few of the rank-1 products whose sum makes up the full product, chosen so that the sample has good properties.
Example # Simulating a matrix by random sampling.
Suppose \(A = \begin{bmatrix} a & b \end{bmatrix} \)
We construct a matrix \(\hat{A}\) by randomly selecting two columns with probability \( \frac{1}{2} \)
...
November 27, 2023
Perspective # I’m studying investing
Why?
Survivor # https://youtu.be/KRy2oU0Df5k?si=R47vLli0JV3S3O-Y
Three kinds of people # 시골의사 박경철, Jeremy Rifkin
creator
Ford: A car can take you anywhere; a train can only move on a fixed path.
facilitator
Rockefeller: People will need gas for cars, etc.
NY Times: Ford made a car which can carry 4 but costs as much as building a train which can carry hundreds.
Phone: Who would want that expensive phone? We have beepers.
The goal, the perspective # Why are you dividing the three types of human that way?
...
November 24, 2023
matrix methods # These are my notes on https://ocw.mit.edu/courses/18-065-matrix-methods-in-data-analysis-signal-processing-and-machine-learning-spring-2018/pages/readings/
lectures 1-9. Probably I should do the problem sets.
I digressed from statistical learning (Statistical learning (part 2)) because it required some math background.
In contrast to Linear Algebra Done Right (Linear algebra), the treatment of matrices here is more concrete, with the assumption that your matrix \( M(T) \) is real and the inner product is assumed to be the inner product
...
November 19, 2023
Linear algebra # This is my take on reading a book on linear algebra. (Linear Algebra Done Right, up to ch. 7)
What is it about # Two basic ingredients # uniqueness (sameness)
vector: two vectors are different if one can’t be stretched to match the other
set of vectors: two sets of vectors are different if some vector in one set can’t be represented as a linear combination of vectors in the other set
spanness
...
November 18, 2023
Statistical learning (part 2) # These are my notes from the course at https://learning.edx.org/course/course-v1:StanfordOnline+STATSX0001+2T2023/home (lectures 4.8-6.9)
Linear regression (part 2) # The model # The Linear Model:
The linear regression model is typically written as: \[ Y = X\beta + \epsilon \] where \( Y \) is the response vector, \( X \) is the matrix of predictors (including a column of ones for the intercept), \( \beta \) is the vector of regression coefficients, and \( \epsilon \) represents the error terms.
...
November 17, 2023
Game # A game host has a concert ticket whose price is determined by supply and demand among the game players. (He has many tickets for the same concert.) He invites you to play the game, which ends a week later.
You can bet on price movement; if you get it right, you make money.
Specifically, you can take one of the following actions: # If you think the price will rise:
...
November 17, 2023
what we have # (P1) From \( n \) linearly dependent vectors, we can pick \(n-1\) vectors which span the same set. # This is true because “dependent” means some vector already lies in the span of the others, so removing it does not change the span.
(P2) # of independent vectors ≤ # of spanning vectors (finite dimensional) # To a spanning set, we can add a vector from the independent set. By P1, this lets us remove a dependent vector from the enlarged set.
...
November 14, 2023
Statistical Learning # These are my notes from the course at https://learning.edx.org/course/course-v1:StanfordOnline+STATSX0001+2T2023/home (lectures 1-4.7)
gpt # linear regression https://chat.openai.com/share/30270f42-7ef7-45f7-adec-5858d3dc8360
logistic regression https://chat.openai.com/share/ab52dce1-f66a-4cc6-9807-03f1b7888e31
case-control sampling https://chat.openai.com/share/d1986c13-a01b-4bc4-afa8-e1384ddc303f
bias variance tradeoff # \[\begin{aligned} \text{MSE}(x) &= E[(Y - \hat{f}(x))^2] \\ \text{MSE}(x) &= E[(f(x) + \epsilon - \hat{f}(x))^2] \\ \text{MSE}(x) &= \text{Var}(\hat{f}(x)) + [\text{Bias}(\hat{f}(x))]^2 + \text{Var}(\epsilon) \\ \text{Bias}(\hat{f}(x)) &= E[\hat{f}(x)] - f(x) \end{aligned} \]
linear regression # setup # \( \mathbb{E}[Y|X=4] \) means the expected value of \( Y \) given \( X = 4 \)
...
November 6, 2023
physics # https://www.youtube.com/playlist?list=PLFE3074A4CB751B2B
Lecture 11-19
Haven’t had time to write notes
Black Scholes (part 3) # https://chat.openai.com/share/c759ecbe-3cad-42aa-b76e-c222009ceb10
How to use Black Scholes
intro to datascience # introduction to data science
These are my notes from the course at https://learning.edx.org/course/course-v1:HarvardX+PH125.4x+1T2023/home sections 1 - 6
data storage for timeseries data # https://chat.openai.com/c/2934c9b9-b7a9-4f4a-bfae-66544690f680
brin vs btree # https://www.crunchydata.com/blog/postgres-indexing-when-does-brin-win
influxdb vs postgresql # https://portavita.github.io/2018-07-31-blog_influxdb_vs_postgresql
microsoft # https://arxiv.org/pdf/2009.11189.pdf (not read yet)
November 6, 2023
math
introduction to data science # These are my notes from the course at https://learning.edx.org/course/course-v1:HarvardX+PH125.4x+1T2023/home
Mostly basic statistics.
gpt chat #
overall: https://chat.openai.com/share/ffce1eb3-c872-48f4-b3fa-22b7bf6d2113
on T distribution: https://chat.openai.com/share/c123934f-8c91-4c11-9b02-a13ec528db03
on linearity: https://chat.openai.com/share/71ab41c6-5c63-43ad-9676-8d4ab108f926
polling vs forecasting # Polling looks at the current situation, whereas forecasting uses the data to predict the future.
urn model # Under a probability assumption (a model), each bean has probability p. We can then model the sum of n beans, so our sample, the sum of n beans, has properties we know (because of the assumption). That’s how we guess, and measure our knowledge about, the unknown p.
...
October 29, 2023
Fundamentals of physics # https://www.youtube.com/playlist?list=PLFE3074A4CB751B2B
Summary of lecture 1 - 10.
Newton’s 3 laws #
You need a force to change a velocity, whereas Aristotle would have said you need a force to move an object.
\( F = ma \) This doesn’t survive QM. You need an inertial observer (i.e., one who has no acceleration).
\( F_{12} = - F_{21}\)
How we can remove time from the equation (for constant \( a \)). # \begin{align*} & \frac{dv}{dt} = a \\ & v \frac{dv}{dt} = a v \\ & \frac{d}{d t}\left(\frac{v^2}{2}\right)=a \frac{d x}{d t} \\ & d\left(\frac{v^2}{2}\right)=a d x \\ & \int d \left(\frac{v^2}{2}\right)=\int a d x \\ & \frac{v^2}{2}-\frac{v_0^2}{2}=a\left(x-x_0\right) \\ \end{align*}
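The time-free relation at the end of the derivation can be sanity-checked by stepping a particle forward in time and comparing both sides. This is a rough sketch assuming constant acceleration; the function `simulate` and the numbers are made up for illustration.

```python
def simulate(a, v0, x0, t_end, steps=10_000):
    """Advance position and velocity under constant acceleration a."""
    dt = t_end / steps
    v, x = v0, x0
    for _ in range(steps):
        # For constant a this position update is exact over each small step.
        x += v * dt + 0.5 * a * dt * dt
        v += a * dt
    return x, v


# Hypothetical values, chosen only for illustration.
a, v0, x0 = 2.0, 3.0, 1.0
x, v = simulate(a, v0, x0, t_end=4.0)

lhs = v**2 / 2 - v0**2 / 2  # left side of the last line of the derivation
rhs = a * (x - x0)          # right side
print(lhs, rhs)             # the two should agree up to rounding error
```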
...
October 26, 2023
Resources # https://youtu.be/-DiZGpAh7T4?si=YN-a_mZ1gdwlt54n
https://web.mit.edu/kirtley/kirtley/binlustuff/literature/control/Kalman%20filter.pdf
https://arxiv.org/pdf/1910.03558.pdf
https://chat.openai.com/share/d2e080cf-607c-4492-a359-f43c4e7e3daa (gpt)
Kalman filter # The Kalman filter operates in a loop with the main stages being prediction and update.
It has a model (state) which drives the phenomenon (observation). The transition from the previous state to the next state is linear. The mapping from state to observation is linear.
With each new observation, you update your model so that your predictions get better, balancing the weight of ‘model prediction’ against ‘current observation’.
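The predict/update loop can be sketched in one dimension. This is a minimal illustration, not taken from the linked resources: it assumes a scalar state with identity transition and observation models, and the noise variances `q` and `r` are hypothetical parameters.

```python
def kalman_1d(observations, x0, p0, q, r):
    """Scalar Kalman filter with identity transition and observation models.

    x: state estimate, p: variance of the estimate,
    q: process noise variance, r: observation noise variance.
    """
    x, p = x0, p0
    estimates = []
    for z in observations:
        # Predict: the state carries over unchanged, uncertainty grows by q.
        x_pred = x
        p_pred = p + q
        # Update: the gain k weighs the model prediction against the observation.
        k = p_pred / (p_pred + r)
        x = x_pred + k * (z - x_pred)
        p = (1 - k) * p_pred
        estimates.append(x)
    return estimates


# Hypothetical data: a constant true value of 5.0 observed 50 times, starting from a guess of 0.
estimates = kalman_1d([5.0] * 50, x0=0.0, p0=1.0, q=0.01, r=1.0)
print(estimates[0], estimates[-1])  # the estimate moves from the initial guess toward 5.0
```

A large `r` (noisy observations) makes the gain small, so the filter trusts the model; a large `q` does the opposite.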
...
October 21, 2023
Random walk # A scenario where the position at the next step is determined by the position at the previous step and the independent outcome of the next coin flip. This is like flipping a coin, where each head means a step forward (+1) and each tail means a step backward (-1). Each step is independent of others and has equal probability.
\[ \begin{aligned} X_i = \begin{cases} 1 & \text{with probability } \frac{1}{2} \\ -1 & \text{with probability } \frac{1}{2} \\ \end{cases} \end{aligned} \]
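A short simulation makes the definition concrete. This is a sketch using only the standard library; the function name `random_walk` and the seed are arbitrary choices for illustration.

```python
import random


def random_walk(n_steps, seed=None):
    """Return the list of positions of a simple symmetric random walk starting at 0."""
    rng = random.Random(seed)
    position = 0
    path = [position]
    for _ in range(n_steps):
        # Each step is +1 or -1 with probability 1/2, independent of earlier steps.
        position += rng.choice([1, -1])
        path.append(position)
    return path


path = random_walk(1000, seed=0)
print(path[:10])
```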
...
October 20, 2023
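Mathematical definition of distribution of wealth #
Gini \[ G = 2 \int_{0}^{1} \left(p - L(p)\right) dp \] where \( L(p) \) is the Lorenz curve (the share of total wealth held by the poorest fraction \( p \) of the population).
Theil Index \[ T = \sum_{i=1}^{n} p_i \ln\left(\frac{p_i}{q_i}\right) \]
Palma ratio, Atkinson Index, Wealth Concentration ..
How is spread measured? # MAD (mean absolute difference) # You average the absolute difference between every pair of values (rather than the difference from the mean).
Gini is essentially MAD, normalized by twice the mean.
\[ G = \frac{E(|Y_1-Y_2|)}{2\mu} = 2 \int_{0}^{1} \left(p - L(p)\right) dp \]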
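One common way to compute the Gini coefficient from a sample is as the mean absolute difference between all pairs of values, divided by twice the mean. The sketch below uses that definition; the function name `gini` and the sample data are hypothetical.

```python
def gini(values):
    """Gini coefficient as the mean absolute pairwise difference over twice the mean.

    Averages |a - b| over all ordered pairs, including a value paired with itself.
    """
    n = len(values)
    mean = sum(values) / n
    if mean == 0:
        return 0.0
    mean_abs_diff = sum(abs(a - b) for a in values for b in values) / (n * n)
    return mean_abs_diff / (2 * mean)


print(gini([1, 1, 1, 1]))  # everyone holds the same wealth: no inequality
print(gini([0, 0, 0, 1]))  # one person holds everything
```

With this pairwise definition a single holder among n people gives (n - 1) / n, which approaches 1 as the population grows.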
Variance (or STD) # You average squared difference from mean.
...
October 20, 2023
Israel vs Palestinian # Israel seems to occupy a small region relative to all the Palestinian allies around it. How is Israel holding up against them?
relative strength of two sides # This brings up the following notions:
nominal GDP, real GDP #
Nominal GDP: GDP at current prices
Real GDP: GDP at inflation-adjusted prices
GDP at PPP # GDP at purchasing-power-adjusted prices. This is different from exchange-rate-adjusted prices.
...
October 20, 2023
chatgpt conversation # https://chat.openai.com/share/424a4cc4-9ff3-469b-9e3c-c2ce4db355c1
Relative strength of two countries economics@investing # What’s their brief history?
How is Israel holding up against the bigger countries surrounding it? This leads to the question of how to measure the relative strength of a country.
To measure the relative strength of a country, not just aggregate GDP but also the shape of the wealth distribution might be relevant, which leads to the next topic.
Mathematical definition of distribution of wealth @investingmath # There are many definitions of spread.
...
October 20, 2023
How information flows might be an interesting topic # Here are some project ideas to get started in the field
Quantitative Finance: Momentum and Reversal in Stock Returns # Objective: # Analyze how quickly stock prices adjust to new information, focusing on the phenomena of momentum (stocks that have recently performed well continue to do well in the short-term) and reversal (stocks that have recently performed well tend to underperform in the long-term).
...
October 20, 2023
Framework1 # Emphasis on interpretability, for learning purposes
I need to be able to interpret the result in order to learn
interpretability framework
Framework2 # Emphasis on adaptability.
Nothing is built on stone; All is built on sand, but we must build as if the sand were stone. - Jorge Luis Borges
Framework3 # Emphasis on what is fundamental.
more dimensions # There seem to be many more dimensions along which one can categorize strategies.
...
October 20, 2023
Q1: Does debate assume an assertion that can be answered yes or no? # What does it mean to ask the Q1 to another person? # Decidable Statements # Let’s consider a formalization of the idea that there exists a statement with a determinate truth value.
Definitions # Let \( \mathcal{S} \) be the set of all statements. Define the truth value function \( \tau: \mathcal{S} \rightarrow \{0, 1, \text{undefined}\} \) such that: \( \tau(P) = 1 \) if \( P \) is true.
...
October 18, 2023
Brownian motion # https://chat.openai.com/share/971a53a8-6c87-48d8-bed7-84d4975e2e60
\[ W_{N,t} = \sum_{i=1}^{Nt}{\frac{1}{\sqrt{N}}X_{i}} \]
We are considering N events per unit of time, over a total time t. For example, we are looking at 24 data points per day (one each hour) over multiple days.
Why are we dividing by \( \sqrt{N} \) rather than \( \sqrt{Nt} \)?
In summary,
N: # Represents the number of steps or discrete events per unit of time. As \( N \) becomes large (approaching infinity for the continuous case), each step’s contribution is scaled down by a factor of \( \frac{1}{\sqrt{N}} \) so that the variance of the cumulative sum stays at \( t \) and the limit remains well-behaved.
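The effect of the scaling can be seen empirically: a sum of \( Nt \) independent ±1 steps, each scaled by \( \frac{1}{\sqrt{N}} \), has variance \( Nt \cdot \frac{1}{N} = t \). The sketch below estimates that variance by simulation; the values of `N`, `t`, the number of paths, and the seed are arbitrary choices for illustration.

```python
import random


def scaled_walk(N, t, rng):
    """Return W_{N,t}: the sum of N*t independent +/-1 steps, each scaled by 1/sqrt(N)."""
    n_steps = int(N * t)
    return sum(rng.choice([1, -1]) for _ in range(n_steps)) / N ** 0.5


rng = random.Random(0)
N, t, n_paths = 100, 2, 2000  # hypothetical settings
samples = [scaled_walk(N, t, rng) for _ in range(n_paths)]

mean = sum(samples) / n_paths
variance = sum((s - mean) ** 2 for s in samples) / n_paths
print(mean, variance)  # mean should be near 0 and variance near t
```

Dividing by \( \sqrt{Nt} \) instead would force the variance to 1 at every time, which would lose the growth of uncertainty with \( t \).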
...
October 18, 2023
Fourier transform (part2) # This is part 2, following fourier series
This is a summary of lectures 5-7 of https://www.youtube.com/playlist?list=PLB24BC7956EE040CD
The Fourier transform of the Gaussian is discussed further in the chat. https://chat.openai.com/share/bc7157eb-da88-447d-ab6c-422c7bef80f2
Fourier Transform and the Heat Equation # Heat Equation for a Heated Ring # Consider a ring that is initially heated in a certain manner. We can describe the temperature distribution across the ring as a function of position \(x\) and time \(t\).
...
October 17, 2023
Definitions and Equations # https://chat.openai.com/share/e71eecee-4263-4dc6-842e-a0dee3af28b0
this is part2 after Bellman equation (part1)
Definition of \( G_t \) # It represents the random variable of cumulative rewards at time \( t \). The cumulative rewards can be expressed as:
\[\begin{aligned} G_t &= R_{t+1} + \gamma R_{t+2} + \gamma^2 R_{t+3} + \dots \\ G_t &= R_{t+1} + \gamma G_{t+1} \end{aligned} \]
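The recursion \( G_t = R_{t+1} + \gamma G_{t+1} \) gives a simple way to compute returns by working backwards from the end of an episode. This is a sketch assuming a finite episode (so the return after the last reward is 0); the function name and the reward values are hypothetical.

```python
def discounted_returns(rewards, gamma):
    """Compute G_t for every step of a finite episode.

    rewards[t] plays the role of R_{t+1}; the return after the final reward is taken as 0.
    """
    returns = [0.0] * len(rewards)
    g = 0.0
    for t in reversed(range(len(rewards))):
        # G_t = R_{t+1} + gamma * G_{t+1}
        g = rewards[t] + gamma * g
        returns[t] = g
    return returns


rewards = [1.0, 0.0, 2.0]  # hypothetical rewards
returns = discounted_returns(rewards, gamma=0.5)
print(returns)  # [1.5, 1.0, 2.0]
```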
Definition of \( V(s) \) # It is the expected cumulative rewards given state \( s \).
...
October 16, 2023
Bellman equation # https://chat.openai.com/share/4def344b-c94f-4194-a428-adbeb4d12175
It seems to be one of the backbone ideas of reinforcement learning, used when we want a formula for updating the decision process.
The Bellman equation specifically provides the goal function that the agent can try to optimize (it’s called the value function).
It was used in the Q-learning algorithm for the frozen-lake example. It is also used in the Monte Carlo method for updating the estimates.
Monte Carlo method # https://chat.openai.com/share/2b0b9c46-4744-4e46-93cb-aca8e02716e0
...
October 15, 2023
fourier series # watching fourier transforms # https://www.youtube.com/playlist?list=PLB24BC7956EE040CD
Following is a summary of lecture 1 - 4. # https://chat.openai.com/share/9f767d73-175a-4f1b-b476-7c1f10988e66
heat equation # comes up as the application of fourier series
https://chat.openai.com/share/a66f3b58-13d9-4273-8713-057d72e1e527
I had a slightly misguided impression of the “periodic nature” of the problem where the object’s shape is a ring. In the end, the discussion caused by the misinterpretation led me to understand what to expect from the “Fourier transform” (as opposed to Fourier series)
...
October 14, 2023
Hypothesis # I’m not sure this will work.
this might work, because
market might be efficient but people are not rational
General Strategy # need neither perfect information nor a perfect algorithm
I’ll perform the trading manually and see if I can automate some of my work.
Goal # personal # money # beat the stock market by a noticeable margin
when stock market yields 5% return, I want to achieve 5.
...
October 14, 2023
process # select buy candidates
for each candidate
indicators produce (multiple) predictions
prepare buy / sell (2nd level indicator)
combine multiple predictions on each candidate (weight them)
produce possible buy candidates
evaluate current holdings
for each holding, the 2nd level indicator evaluates the holding
We bought this holding with this prediction data. Is the prediction still valid? Can you (indicator) evaluate the current standing?
decide buy / sell
With new candidates for buy, evaluations for current holdings
...
October 14, 2023
Book List #
The Intelligent Investor
Principles of Economics
Software Engineering at Google: Lessons Learned from Programming Over Time
Warren Buffett Accounting Book: Reading Financial Statements for Value Investing
Algorithmic Trading: Winning Strategies and Their Rationale
Mostly Harmless Econometrics: An Empiricist’s Companion
An Introduction to Quantitative Finance
The Outer Limits of Reason: What Science, Mathematics, and Logic Cannot Tell Us
The Theory That Would Not Die: How Bayes’ Rule Cracked the Enigma Code, Hunted Down Russian Submarines, and Emerged Triumphant from Two Centuries of Controversy
Web Scraping with Python: Collecting More Data from the Modern Web
Python for Data Analysis: Data Wrangling with pandas, NumPy, and Jupyter
Python for Finance Cookbook: Over 80 powerful recipes for effective financial data analysis, 2nd Edition
Technical Analysis of the Financial Markets: A Comprehensive Guide to Trading Methods and Applications
some basic math when needed
some basic ai when needed
current reading #
October 13, 2023
linearity # https://chat.openai.com/share/1d1f168c-0009-4906-b451-bc77dce64dc9
what is linearity (in linear algebra) # There is a mapping from one object to another (from v to w).
Two objects can be added, and we can look at the mapping of the resulting object. The mapping is linear when T(v1 + v2) = T(v1) + T(v2).
(There’s an additional homogeneity requirement as well, T(cv) = cT(v), and there is an assumption that v is an element of a vector space, with some additional conditions such as the properties of +, the + identity, and so on.)
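The two requirements can be tested directly on small examples. The sketch below checks additivity and homogeneity for a matrix map and for a shift map that fails both; the matrix `A`, the helper names, and the test vectors are all made up for illustration, and passing a check on a few vectors is only evidence, not a proof of linearity.

```python
def add(v1, v2):
    """Component-wise vector addition."""
    return [a + b for a, b in zip(v1, v2)]


def scale(c, v):
    """Multiply every component of v by the scalar c."""
    return [c * x for x in v]


A = [[2, 0], [1, 3]]  # hypothetical matrix


def T(v):
    """Matrix map v -> A v, which is linear."""
    return [sum(m * x for m, x in zip(row, v)) for row in A]


def shift(v):
    """Adds 1 to every component; this map is not linear."""
    return [x + 1 for x in v]


def is_additive(f, v1, v2):
    return f(add(v1, v2)) == add(f(v1), f(v2))


def is_homogeneous(f, c, v):
    return f(scale(c, v)) == scale(c, f(v))


v1, v2 = [1, 2], [3, -1]
print(is_additive(T, v1, v2), is_homogeneous(T, 3, v1))          # True True
print(is_additive(shift, v1, v2), is_homogeneous(shift, 3, v1))  # False False
```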
...
October 12, 2023
chatgpt conversation # chatgpt conversation
this conversation arose from another conversation on RL (reinforcement learning)
remaining questions # usefulness of MLE # MLE is like asking what’s the most likely universe we are in. No, it’s not: it’s more like asking which universe gives the observed event the highest probability, considering the probability of the event and only that.
so if we want to pick which universe we want to go to, MLE gives the answer.
...
October 12, 2023
chatgpt conversation # chatgpt conversation
Frozen Lake (Q-learning) # frozen lake (Q learning) implementation
Frozen Lake (policy gradient) # implementation
heuristic aspect of RL # definition # There is a goal. Progress toward the goal can be measured numerically in intermediate states. (Reward)
Find a way to reach the goal using the reward as a hint. One often has to react to an intermediate state (without perfect information). Often, this is modeled by a decision rule (Policy) acting on the perceived state.
...
October 12, 2023
import gym
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim


class PolicyNet(nn.Module):
    def __init__(self, n_states, n_actions):
        super(PolicyNet, self).__init__()
        self.fc = nn.Linear(n_states, n_actions)

    def forward(self, x):
        return torch.softmax(self.fc(x), dim=-1)


def sample_action(policy, state):
    state_one_hot = torch.eye(n_states)[state]
    action_probs = policy(state_one_hot)
    return torch.multinomial(action_probs, 1).item()


env = gym.make('FrozenLake-v0', is_slippery=False)
n_states = env.observation_space.n
n_actions = env.action_space.n
policy = PolicyNet(n_states, n_actions)
optimizer = optim.Adam(policy.parameters(), lr=0.01)

for episode in range(500):
    episode_rewards = []
    episode_log_probs = []
    state = env.
...
October 12, 2023
import numpy as np

# Define the environment
n_states = 16  # For the 4x4 grid
n_actions = 4  # Up, Down, Left, Right

# Initialize the Q-table with zeros
Q = np.zeros((n_states, n_actions))

frozen_lake_list = [
    'SFFF',
    'FHFH',
    'FFFH',
    'HFFG'
]
frozen_lake = np.array([list(row) for row in frozen_lake_list])


def state_to_coord(state):
    """Convert a state number to a coordinate (i, j) on the grid."""
    return divmod(state, 4)


def coord_to_state(coord):
    """Convert a grid coordinate (i, j) to a state number.
...
October 12, 2023
You:
art of war
what does it mean
if the campaign is protracted, the resources of the State will not be equal to the strain
ChatGPT:
The quote “If the campaign is protracted, the resources of the State will not be equal to the strain” is from “The Art of War,” a classical Chinese military treatise written by Sun Tzu more than 2,500 years ago. The text is a comprehensive guide on strategic warfare, and it has been studied and referenced by military leaders, business executives, and scholars throughout history.
...
October 12, 2023
You:
what does it mean for a government to have debt? (is it different from a country having debt?)
ChatGPT:
When we talk about a government having debt or a country having debt, in most contexts, we’re referring to the same thing: the national debt. However, the nuances in phrasing can sometimes imply specific distinctions. Here’s a breakdown:
Government Debt: This usually refers to the total amount of money that the government owes to external creditors and domestic lenders.
...
October 12, 2023
You:
what is book value of common stock?
ChatGPT:
The book value of common stock, often simply called “book value,” represents the value of a company’s common equity as per its balance sheet. It’s essentially the difference between a company’s total assets and its total liabilities. Here’s how you can calculate it:
Book Value of Common Stock = Total Assets - Total Liabilities - Preferred Stock Equity
(If the company has issued preferred stock, its equity value needs to be subtracted to get the book value that pertains only to the common stockholders.
...
October 12, 2023
You:
Y = C + I + G
GDP = Consumption + Investment + Government purchases, for a closed economy
We can manipulate the equation to get Y - C - G = I
The left side shows total output - total consumption, and we call it national saving, or saving for short.
If we denote the value by S, it becomes S = I
i’m paraphrasing an economics text
ChatGPT:
...
October 12, 2023
You:
people recommend using multiple types of investment. I want to ask mainly about stocks / bonds
what are the key decision points for choosing one over the other, and the general indications that either one is becoming favorable in the near future? and what are the key factors to consider when purchasing bonds? (i think it’s different from stocks, where value changes according to the market. bond value would probably be affected by the market too, but i guess the mechanism or influencing factor is different. it is determined by ?)
...
October 12, 2023
You:
plz describe how money is made in the US
specifically, I wanna know the conceptual steps that the following entities take, how money (digital) flows, and how it’s bookkept (conceptually, i guess they must conform to the accounting process, so transferring money from a to b would result in two conceptual accounting entries, in a and b)
US government, FDR, banks
I’m asking how money is created and put into the economic system.
...
October 12, 2023
You:
tell me about P/E
ChatGPT:
Certainly! The P/E ratio, or Price-to-Earnings ratio, is a widely used metric in the finance and investment world. It’s used to evaluate the valuation of a company. Here’s an overview:
Definition: P/E ratio is calculated by dividing the market price per share of a company by its earnings per share (EPS) for a specific period, typically one year.
\[ \text{P/E} = \frac{\text{Market Price Per Share}}{\text{Earnings Per Share (EPS)}} \]
...
October 12, 2023
You:
pandas session,
i was learning crosstab
pivot_table seems related, can you show me simple example of pivot_table
ChatGPT:
Certainly! Both crosstab and pivot_table in Pandas allow you to reshape data and perform aggregations, but they’re designed with slightly different use cases in mind.
Here’s a simple example to help you understand the pivot_table:
Let’s consider a dataset of sales data:
import pandas as pd

data = {
    'Date': ['2023-10-01', '2023-10-01', '2023-10-02', '2023-10-02', '2023-10-03'],
    'Product': ['A', 'B', 'A', 'A', 'B'],
    'Sales': [100, 150, 200, 50, 300]
}
df = pd.
...
October 12, 2023
You:
how can you trim string data in pandas dataframe
ChatGPT:
In pandas, you can trim leading and trailing whitespaces from a string in a DataFrame using the str.strip() method.
Here’s a step-by-step guide on how to trim string data in a pandas DataFrame:
First, you need to import pandas:
import pandas as pd
Create a sample DataFrame:
df = pd.DataFrame({'A': [' foo ', ' bar ', ' baz '], 'B': [' alpha ', ' beta ', ' gamma ']})
print(df)
Output:
...
October 12, 2023
(original) https://chat.openai.com/share/cd0a2f47-fafd-4e73-a545-193e0b69bb65
You:
i wanna know about reinforcement learning
let’s start from bigger picture, but with concrete example.
ChatGPT:
Certainly! Reinforcement Learning (RL) is a type of machine learning where an agent learns to make decisions by taking actions in an environment to maximize some notion of cumulative reward.
Let’s illustrate this with a classic example: The FrozenLake Environment.
Scenario: Imagine a frozen lake that’s divided into a grid. The agent starts in the top-left corner and its goal is to reach the bottom-right corner where there’s a frisbee.
...
October 12, 2023
You:
i’m trying to understand the following explanation about softmax. can you sketch the steps that need to be studied, for 1st-2nd year college students?
Mathematical Justification: The softmax function can be seen as the gradient of the log-normalizer of the categorical distribution in the context of log-linear models. This is a bit more advanced, but if you’re familiar with maximum entropy models and the principle of maximum likelihood, the softmax function arises naturally when you try to maximize the likelihood of a log-linear model subject to empirical constraints.
...
October 1, 2023
(original) https://chat.openai.com/share/fdc20be3-d70b-4c98-8815-308ae676739c
You:
in stat, when we estimate y from x with y = f(x), we assume there’s an error term such that y = f(x) + e, and we also assume e has mean 0. why do we make the assumption? do we adjust our estimator f (what’s the proper term for f?) to make e have 0 mean?
ChatGPT:
Yes, when we use a model of the form \( y = f(x) + e \), the term \( e \) represents the error or residual.
...
You:
what is “standard” matrix for linear map ?
ChatGPT:
The “standard matrix” of a linear map refers to the matrix representation of that map with respect to the standard bases of the domain and codomain vector spaces.
To be more specific, let \( T: \mathbb{R}^n \to \mathbb{R}^m \) be a linear transformation. Let \( e_1, e_2, \dots, e_n \) be the standard basis for \( \mathbb{R}^n \), where \( e_i \) is the vector with a 1 in the i-th position and zeros elsewhere.
...