Merged
4 changes: 2 additions & 2 deletions docs/source/feature/data.md
@@ -342,10 +342,10 @@ pipeline.global-job-parameters: |
### input_fields

```
-input_fields: {
+input_fields {
input_name: "input1"
}
-input_fields: {
+input_fields {
input_name: "input2"
input_type: DOUBLE
}
34 changes: 17 additions & 17 deletions docs/source/feature/feature.md
@@ -69,7 +69,7 @@ feature_configs {
feature_name: "cate"
expression: "item:cate"
embedding_dim: 32
-zch: {
+zch {
zch_size: 1000000
eviction_interval: 2
lfu {}
@@ -206,7 +206,7 @@ feature_configs {
Embedding feature: supports string values such as `"0.1|0.2|0.3|0.4"`, and ARRAY\<float> values such as `[0.1,0.2,0.3,0.4]` (recommended, better performance). Configure it as follows:

```
-feature_configs: {
+feature_configs {
raw_feature {
feature_name: "pic_emb"
expression: "item:pic_emb"
@@ -224,7 +224,7 @@ feature_configs: {
Combines the input discrete values (i.e., takes their Cartesian product), for example age + cate:

```
-feature_configs: {
+feature_configs {
combo_feature {
feature_name: "combo_age_cate"
expression: ["user:age", "item:cate"]
@@ -246,7 +246,7 @@ feature_configs: {
`key` is a multi-value id; the multi-value separator can be set with **separator** and defaults to `\x1d`.

```
-feature_configs: {
+feature_configs {
lookup_feature {
feature_name: "user_cate_cnt"
map: "user:kv_cate_cnt"
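@@ -278,7 +278,7 @@ feature_configs: {
`match_feature` relies on three fields, `nested_map`, `pkey`, and `skey`, to match a feature value from a kkv structure. `nested_map` is a multi-value kkv map such as `pk1^sk1:0.2,sk2:0.3,sk3:0.5|pk2^sk4:0.1`, where `:` is the inner key-value separator, `,` is the inner multi-value separator, `^` is the outer key-value separator, and `|` is the outer multi-value separator; these separators cannot be customized. To generate the feature, `pkey` is used as the primary key and `skey` as the secondary key to look up the kkv pairs held in the `nested_map` field and obtain the final feature value.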
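
The lookup described above can be sketched in Python. This is only an illustration of the string format, not the library's implementation: the helper names `parse_nested_map` and `match_value` are hypothetical, and returning `None` when no match is found is an assumption, since the text does not say what happens in that case.

```python
from typing import Dict, Optional


def parse_nested_map(nested_map: str) -> Dict[str, Dict[str, float]]:
    """Parse a kkv string such as 'pk1^sk1:0.2,sk2:0.3|pk2^sk4:0.1'."""
    result: Dict[str, Dict[str, float]] = {}
    for outer in nested_map.split("|"):  # '|' separates outer entries
        if not outer:
            continue
        pkey, _, inner = outer.partition("^")  # '^' splits pkey from its inner map
        inner_map = result.setdefault(pkey, {})
        for pair in inner.split(","):  # ',' separates inner entries
            if not pair:
                continue
            skey, _, value = pair.partition(":")  # ':' splits skey from its value
            inner_map[skey] = float(value)
    return result


def match_value(nested_map: str, pkey: str, skey: str) -> Optional[float]:
    """Return the value stored under (pkey, skey); None if absent (assumed)."""
    return parse_nested_map(nested_map).get(pkey, {}).get(skey)


example = "pk1^sk1:0.2,sk2:0.3,sk3:0.5|pk2^sk4:0.1"
print(match_value(example, "pk1", "sk2"))  # 0.3
print(match_value(example, "pk2", "sk1"))  # None
```

With the example string from the text, the pair (`pk1`, `sk2`) resolves to `0.3`, while a secondary key that does not exist under `pk2` yields no value.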

```
-feature_configs: {
+feature_configs {
match_feature {
feature_name: "user_cate_brand_cnt"
nested_map: "user:kkv_cate_brand_cnt"
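@@ -313,7 +313,7 @@ feature_configs: {
Performs computations on numeric features, for example checking whether the current user's age is greater than 18, or whether the user's age meets an item's age requirement.

```
-feature_configs: {
+feature_configs {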
expr_feature {
feature_name: "combo_age_cate"
variables: ["user:u_age", "item:i_age"]
@@ -433,7 +433,7 @@ feature_configs: {
`overlap_feature` computes the proportion of terms shared between the `query` and `title` fields. Terms in `query` and `title` are separated by `\x1d` by default; a different separator can be set with **separator**.
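
As a rough illustration of a term-overlap ratio, here is a minimal Python sketch. It rests on a loud assumption: the ratio is taken as the share of distinct `query` terms that also occur in `title`. The text does not state which denominator the feature uses, and the function name `overlap_ratio` is hypothetical.

```python
def overlap_ratio(query: str, title: str, separator: str = "\x1d") -> float:
    """Share of distinct query terms that also occur in the title.

    Assumption: the denominator is the number of distinct query terms;
    the documentation excerpt does not specify the normalization.
    """
    query_terms = {term for term in query.split(separator) if term}
    title_terms = {term for term in title.split(separator) if term}
    if not query_terms:
        return 0.0
    return len(query_terms & title_terms) / len(query_terms)


query = "\x1d".join(["red", "running", "shoes"])
title = "\x1d".join(["running", "shoes", "for", "men"])
ratio = overlap_ratio(query, title)  # 2 of the 3 query terms appear in the title
print(round(ratio, 4))
```

Passing a different `separator` argument mirrors the **separator** option mentioned above.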

```
-feature_configs: {
+feature_configs {
overlap_feature {
feature_name: "user_cate_cnt"
query: "user:query"
@@ -470,13 +470,13 @@ feature_configs: {
`tokenize_feature` tokenizes the input string and returns the resulting token ids. It supports tokenizer vocabulary files from tokenize-cpp.

```
-feature_configs: {
+feature_configs {
tokenize_feature {
feature_name: "title_token"
expression: "item:title"
vocab_file: "tokenizer.json"
embedding_dim: 8
-text_normalizer: {
+text_normalizer {
norm_options: [TEXT_LOWER2UPPER, TEXT_SBC2DBC, TEXT_CHT2CHS, TEXT_FILTER]
}
}
@@ -509,7 +509,7 @@ feature_configs: {
Computes the dot product of two key-value-indexed vectors, or the size of the intersection of two sets.
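
Both operations can be sketched in a few lines of Python. This sketch assumes the inputs are already parsed into a `dict` (key to weight) or a `set` of keys; how the feature encodes them as strings is not covered by this excerpt, and the names `kv_dot` and `intersection_size` are hypothetical.

```python
from typing import Dict, Set


def kv_dot(query: Dict[str, float], doc: Dict[str, float]) -> float:
    """Dot product over the keys that both sparse vectors share."""
    return sum((weight * doc[key] for key, weight in query.items() if key in doc), 0.0)


def intersection_size(query: Set[str], doc: Set[str]) -> int:
    """Number of keys present in both sets."""
    return len(query & doc)


q = {"a": 0.5, "b": 2.0}
d = {"b": 3.0, "c": 1.0}
print(kv_dot(q, d))  # 6.0, only key "b" is shared: 2.0 * 3.0
print(intersection_size(set(q), set(d)))  # 1
```

Keys missing from either side contribute nothing, which is the usual convention for sparse vectors.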

```
-feature_configs: {
+feature_configs {
kv_dot_product {
feature_name: "query_doc_sim"
query: "user:query"
@@ -545,7 +545,7 @@ feature_configs: {
Filters elements by boolean values, similar to tf.boolean_mask(tensor, mask).

```
-feature_configs: {
+feature_configs {
bool_mask_feature {
feature_name: "query_doc_sim"
expression: ["user:click_items", "item:is_valid"]
@@ -571,7 +571,7 @@ feature_configs: {
Custom feature. For how to define one, see the [custom operator documentation](https://help.aliyun.com/zh/airec/what-is-pai-rec/user-guide/custom-feature-operator)

```
-feature_configs: {
+feature_configs {
custom_feature {
feature_name: "edit_distance"
operator_name: "EditDistance"
@@ -627,7 +627,7 @@ feature_configs: {

```
# Grouped sequence feature
-feature_configs: {
+feature_configs {
sequence_feature {
sequence_name: "click_seq"
sequence_length: 50
@@ -718,7 +718,7 @@ feature_configs: {

```
# Regular features
-feature_configs: {
+feature_configs {
sequence_id_feature {
feature_name: "click_itemid_seq"
sequence_length: 50
@@ -728,15 +728,15 @@ feature_configs: {
hash_bucket_size: 100000
}
}
-feature_configs: {
+feature_configs {
sequence_raw_feature {
feature_name: "click_price_seq"
sequence_length: 50
sequence_delim: ";"
expression: "user:click_price_seq"
}
}
-feature_configs: {
+feature_configs {
sequence_custom_feature {
feature_name: "seq_expr_1"
operator_name: "SeqExpr"
@@ -752,7 +752,7 @@ feature_configs: {
}
}
}
-feature_configs: {
+feature_configs {
sequence_custom_feature {
feature_name: "seq_expr_2"
operator_name: "SeqExpr"
6 changes: 3 additions & 3 deletions docs/source/feature/zch.md
@@ -10,7 +10,7 @@ feature_configs {
feature_name: "cate"
expression: "item:cate"
embedding_dim: 32
-zch: {
+zch {
zch_size: 1000000
eviction_interval: 2
lfu {}
@@ -70,7 +70,7 @@ distance_lfu {
The function can be written directly with torch's tensor library, for example:

```
-zch: {
+zch {
zch_size: 1000000
eviction_interval: 2
lfu {}
@@ -81,7 +81,7 @@ zch: {
The function can also call the built-in functions `dynamic_threshold_filter`, `average_threshold_filter`, and `probabilistic_threshold_filter`, for example:

```
-zch: {
+zch {
zch_size: 1000000
eviction_interval: 2
lfu {}
6 changes: 3 additions & 3 deletions docs/source/models/dlrm.md
@@ -23,13 +23,13 @@ input:
## Configuration

```
-model_config: {
-feature_groups: {
+model_config {
+feature_groups {
group_name: 'dense'
feature_names: 'price'
wide_deep: DEEP
}
-feature_groups: {
+feature_groups {
group_name: 'sparse'
feature_names: 'user_id'
feature_names: 'cms_segid'
2 changes: 1 addition & 1 deletion docs/source/models/dssm.md
@@ -17,7 +17,7 @@ TorchEasyRec's DSSM supports negative sampling at runtime and, using graph storage, will
## Configuration

```
-data_config: {
+data_config {
...
negative_sampler {
input_path: "data/test/tb_data/taobao_ad_feature_gl_v1"
24 changes: 12 additions & 12 deletions docs/source/models/evaluation_metrics.md
@@ -10,7 +10,7 @@

```
model_config {
-metrics: {
+metrics {
auc {
thresholds: 200
}
@@ -26,7 +26,7 @@ model_config {
model_config {
dbmtl {
task_towers {
-metrics: {
+metrics {
auc {
thresholds: 200
}
@@ -45,7 +45,7 @@ tzrec supports computing model metrics online on the training set during training; the model

```
model_config {
-train_metrics: {
+train_metrics {
auc {
thresholds: 200
}
@@ -76,7 +76,7 @@ ______________________________________________________________________

```
model_config {
-metrics: {
+metrics {
auc {
thresholds: 200
}
@@ -113,7 +113,7 @@ ______________________________________________________________________

```
model_config {
-metrics: {
+metrics {
multiclass_auc {
thresholds: 200
average:'macro'
@@ -137,7 +137,7 @@ ______________________________________________________________________

```
model_config {
-metrics: {
+metrics {
accuracy {
threshold: 0.5
top_k: 5
@@ -163,7 +163,7 @@ ______________________________________________________________________

```
model_config {
-metrics: {
+metrics {
recall_at_k {
top_k: 5
}
@@ -188,7 +188,7 @@ ______________________________________________________________________

```
model_config {
-metrics: {
+metrics {
mean_absolute_error {}
}
}
@@ -210,7 +210,7 @@ ______________________________________________________________________

```
model_config {
-metrics: {
+metrics {
mean_squared_error {}
}
}
@@ -229,7 +229,7 @@ ______________________________________________________________________

```
model_config {
-metrics: {
+metrics {
grouped_auc {
grouping_key: "user_id"
}
@@ -286,7 +286,7 @@ ______________________________________________________________________

```
model_config {
-metrics: {
+metrics {
xauc {
sample_ratio: 1e-3
max_pairs: 100
@@ -311,7 +311,7 @@ ______________________________________________________________________

```
model_config {
-metrics: {
+metrics {
grouped_xauc {
grouping_key: 'age'
max_pairs_per_group: 100
6 changes: 3 additions & 3 deletions docs/source/models/feature_group.md
@@ -74,16 +74,16 @@ model_config {
feature_names: "buy_50__ts"
}
sequence_encoders {
-din_encoder: {
+din_encoder {
input: "click_50"
-attn_mlp: {
+attn_mlp {
hidden_units: [128, 64]
activation: "Dice"
}
}
}
sequence_encoders {
-simple_attention: {
+simple_attention {
input: "buy_50"
}
}
8 changes: 4 additions & 4 deletions docs/source/models/pepnet.md
@@ -13,8 +13,8 @@ The overall structure of PEPNet is shown in the figure below; as can be seen, the core components are these
## Model configuration

```
-model_config: {
-feature_groups: {
+model_config {
+feature_groups {
group_name: 'all'
feature_names: 'user_id'
feature_names: 'cms_segid'
@@ -35,12 +35,12 @@ model_config: {
feature_names: 'price'
wide_deep: DEEP
}
-feature_groups: {
+feature_groups {
group_name: 'domain'
feature_names: 'occupation'
wide_deep: DEEP
}
-feature_groups: {
+feature_groups {
group_name: 'uia'
feature_names: 'user_id'
feature_names: 'adgroup_id'
4 changes: 2 additions & 2 deletions docs/source/models/rocket_launching.md
@@ -10,8 +10,8 @@
## Configuration

```
-model_config: {
-feature_groups: {
+model_config {
+feature_groups {
group_name: 'all'
feature_names: 'user_id'
feature_names: 'cms_segid'
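
Every hunk in this change follows one pattern: `name: {` becomes `name {`. Assuming these configs are protobuf text format, where the colon before a message-valued field is optional, the rewrite could be scripted roughly as below. This is a naive sketch, not the tool used for this change: the name `drop_message_colons` is hypothetical, and the regex does not parse strings or comments, so it should only be run on files that have been reviewed.

```python
import re

# Matches a field name at the start of a line followed by ':' and '{', e.g. 'zch: {'.
_MESSAGE_COLON = re.compile(r"^([ \t]*[A-Za-z_][A-Za-z0-9_]*):[ \t]*\{", re.MULTILINE)


def drop_message_colons(text: str) -> str:
    """Rewrite 'name: {' as 'name {' and leave scalar fields untouched."""
    return _MESSAGE_COLON.sub(r"\1 {", text)


before = "zch: {\n  zch_size: 1000000\n  lfu {}\n}\n"
print(drop_message_colons(before))
```

Scalar fields such as `zch_size: 1000000` keep their colon because the pattern only matches when an opening brace follows.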