KunH's Blog

Welcome

Posted on 2025-02-15

Hi, welcome to my blog😊

RL Algorithm: RLHF & DPO

Posted on 2025-02-15 In Algorithm

Reinforcement Learning (RL) algorithms for aligning LLMs with human preferences: RL from Human Feedback (RLHF) and Direct Preference Optimization (DPO).

Linear Algebra Note#2: Elimination

Posted on 2025-02-11 In Note

This note is based on MIT 18.06📒

RL Algorithm: PPO

Posted on 2025-02-10 In Algorithm

RL foundations and the Proximal Policy Optimization (PPO) algorithm.

Based on PPO by RethinkFun📒

Linear Algebra Note#1: Row, Column & Matrix

Posted on 2024-06-11 In Note

This note is based on MIT 18.06📒

Multiple Inheritance in Python Classes and the super() Mechanism

Posted on 2022-02-03 In Code

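Python supports multiple inheritance for classes, and the built-in super() provides access to the different parent classes by following the method resolution order (MRO).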
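A minimal sketch of the idea, not taken from the post itself: the class names `Base`, `Left`, `Right`, and `Child` and the `calls` attribute are hypothetical, chosen only to show how super() walks the MRO in a diamond-shaped hierarchy.

```python
# Hypothetical diamond hierarchy: Child -> Left, Right -> Base.
class Base:
    def __init__(self):
        # Records the order in which each __init__ body finishes.
        self.calls = ["Base"]


class Left(Base):
    def __init__(self):
        super().__init__()  # next class in the instance's MRO, not necessarily Base
        self.calls.append("Left")


class Right(Base):
    def __init__(self):
        super().__init__()
        self.calls.append("Right")


class Child(Left, Right):
    def __init__(self):
        super().__init__()
        self.calls.append("Child")


# The MRO decides which class each super() call dispatches to.
mro_names = [cls.__name__ for cls in Child.__mro__]
child = Child()

print(mro_names)
print(child.calls)
```

Inside `Left.__init__`, super() resolves to `Right` rather than `Base` when the instance is a `Child`, so each parent's `__init__` runs exactly once.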

Git Note#2: Command

Posted on 2021-12-05 In Note

This Git note series is based on the Git and GitHub tutorial by 尚硅谷 (Atguigu)📒

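Covers Git local repository operations, conflict resolution, and remote repository operations, as well as cross-team collaboration with Git and SSH keys.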

Git Note#1: Introduction

Posted on 2021-12-02 In Note

This Git note series is based on the Git and GitHub tutorial by 尚硅谷 (Atguigu)📒

Introduces the basic structure of Git and the principles behind its version management and branch management.

ML Note#2: Linear Models(2)

Posted on 2021-10-23 In Note

This ML note series is based on Professor Andrew Ng's Stanford CS229 (2018) course📒

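This post extends the linear regression model: it introduces locally weighted regression and the first classification model, logistic regression; shows why least squared error is used as the optimization objective; and introduces a new parameter optimization method, Newton's method.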

ML Note#1: Linear Models(1)

Posted on 2021-10-22 In Note

This ML note series is based on Professor Andrew Ng's Stanford CS229 (2018) course📒

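This post introduces the first model, linear regression, and uses it to show the basic steps of machine learning; it also reviews (or previews) the relevant linear algebra and revisits parameter estimation in linear regression from a geometric perspective.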