Commit e8d879e
deepspeed-chat: add end-of-text special token (deepspeedai#775)
Stages 1 & 2 append the '<|endoftext|>' text marker to all samples.
However, some tokenizers (e.g. OPT, Bloom) encode this marker as a sequence
of subword tokens rather than as a single special token.
This commit adds optional support for registering the EOT marker as a special
token, which forces the tokenizer to encode it as a single token.
Note that using an EOT special token may change the dynamics of stage 3 training.
Therefore, to remain backward compatible, this commit makes it optional.
Change-Id: If98d348fcaa7d6685e755aabe305e23e7649c367
Signed-off-by: Moshe Island <misland@habana.ai>
Co-authored-by: Moshe Island <misland@habana.ai>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
1 parent f7ff9dd commit e8d879e
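
The behaviour described in the commit message can be illustrated with a small, self-contained sketch. It does not reproduce the code in this commit: `ToyTokenizer`, `register_special_tokens`, `build_tokenizer`, the tiny vocabulary, and the `--add_eot_token` flag name are all hypothetical stand-ins chosen for illustration. The sketch only shows why a marker such as '<|endoftext|>' can split into subword pieces, and how an opt-in flag keeps the old behaviour as the default.

```python
import argparse

EOT = "<|endoftext|>"


class ToyTokenizer:
    """Greedy longest-match tokenizer over a tiny vocabulary (illustration only)."""

    def __init__(self, vocab):
        self.vocab = list(vocab)
        self.special_tokens = []

    def register_special_tokens(self, tokens):
        # Special tokens are matched atomically, before any subword matching.
        for tok in tokens:
            if tok not in self.special_tokens:
                self.special_tokens.append(tok)

    def tokenize(self, text):
        pieces = []
        i = 0
        while i < len(text):
            # 1) Prefer a registered special token at this position.
            match = next(
                (s for s in self.special_tokens if text.startswith(s, i)), None
            )
            if match is None:
                # 2) Otherwise take the longest vocabulary piece,
                #    falling back to a single character.
                candidates = [v for v in self.vocab if text.startswith(v, i)]
                match = max(candidates, key=len) if candidates else text[i]
            pieces.append(match)
            i += len(match)
        return pieces


def build_tokenizer(add_eot_token=False):
    # Hypothetical vocabulary that knows the pieces of the marker
    # but not the marker as a whole.
    tok = ToyTokenizer(["<|", "end", "of", "text", "|>", "Hello", " world"])
    if add_eot_token:
        tok.register_special_tokens([EOT])
    return tok


def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    # Assumed flag name; off by default so existing runs are unaffected.
    parser.add_argument(
        "--add_eot_token",
        action="store_true",
        help="Register the end-of-text marker as a special token.",
    )
    return parser.parse_args(argv)


args = parse_args([])  # no flag given: default behaviour
print(build_tokenizer(args.add_eot_token).tokenize("Hello world" + EOT))

args = parse_args(["--add_eot_token"])  # opt in to the special token
print(build_tokenizer(args.add_eot_token).tokenize("Hello world" + EOT))
```

Without the flag, the marker is split into several pieces. With it, the marker comes back as one token. In a real Hugging Face setup, adding a token to the tokenizer typically also means resizing the model's embedding matrix to match, which is one reason such a change can affect later training stages.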
6 files changed
Lines changed: 80 additions & 22 deletions
File tree
- applications/DeepSpeed-Chat/training
  - step1_supervised_finetuning
  - step2_reward_model_finetuning
  - step3_rlhf_finetuning
  - utils
Lines changed: 11 additions & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
191 | 191 | | |
192 | 192 | | |
193 | 193 | | |
| 194 | + | |
| 195 | + | |
| 196 | + | |
| 197 | + | |
| 198 | + | |
194 | 199 | | |
195 | 200 | | |
196 | 201 | | |
| |||
233 | 238 | | |
234 | 239 | | |
235 | 240 | | |
236 | | - | |
| 241 | + | |
| 242 | + | |
| 243 | + | |
| 244 | + | |
| 245 | + | |
| 246 | + | |
237 | 247 | | |
238 | 248 | | |
239 | 249 | | |
| |||
Lines changed: 8 additions & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
74 | 74 | | |
75 | 75 | | |
76 | 76 | | |
| 77 | + | |
| 78 | + | |
| 79 | + | |
| 80 | + | |
77 | 81 | | |
78 | 82 | | |
79 | 83 | | |
| |||
197 | 201 | | |
198 | 202 | | |
199 | 203 | | |
| 204 | + | |
| 205 | + | |
200 | 206 | | |
201 | | - | |
| 207 | + | |
| 208 | + | |
202 | 209 | | |
203 | 210 | | |
204 | 211 | | |
| |||
Lines changed: 10 additions & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
201 | 201 | | |
202 | 202 | | |
203 | 203 | | |
| 204 | + | |
| 205 | + | |
| 206 | + | |
| 207 | + | |
| 208 | + | |
204 | 209 | | |
205 | 210 | | |
206 | 211 | | |
| |||
238 | 243 | | |
239 | 244 | | |
240 | 245 | | |
241 | | - | |
| 246 | + | |
| 247 | + | |
| 248 | + | |
| 249 | + | |
| 250 | + | |
242 | 251 | | |
243 | 252 | | |
244 | 253 | | |
| |||
Lines changed: 32 additions & 15 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
35 | 35 | | |
36 | 36 | | |
37 | 37 | | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
38 | 42 | | |
39 | 43 | | |
40 | 44 | | |
41 | 45 | | |
42 | | - | |
| 46 | + | |
| 47 | + | |
43 | 48 | | |
44 | | - | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
45 | 52 | | |
46 | 53 | | |
47 | 54 | | |
| |||
106 | 113 | | |
107 | 114 | | |
108 | 115 | | |
| 116 | + | |
| 117 | + | |
| 118 | + | |
109 | 119 | | |
110 | | - | |
| 120 | + | |
| 121 | + | |
111 | 122 | | |
112 | 123 | | |
113 | 124 | | |
| |||
126 | 137 | | |
127 | 138 | | |
128 | 139 | | |
129 | | - | |
130 | | - | |
131 | | - | |
132 | | - | |
133 | | - | |
134 | | - | |
| 140 | + | |
| 141 | + | |
| 142 | + | |
| 143 | + | |
| 144 | + | |
| 145 | + | |
| 146 | + | |
135 | 147 | | |
136 | 148 | | |
137 | 149 | | |
| |||
150 | 162 | | |
151 | 163 | | |
152 | 164 | | |
| 165 | + | |
| 166 | + | |
| 167 | + | |
153 | 168 | | |
154 | | - | |
| 169 | + | |
| 170 | + | |
155 | 171 | | |
156 | 172 | | |
157 | 173 | | |
158 | 174 | | |
159 | 175 | | |
160 | | - | |
161 | | - | |
162 | | - | |
163 | | - | |
164 | | - | |
| 176 | + | |
| 177 | + | |
| 178 | + | |
| 179 | + | |
| 180 | + | |
| 181 | + | |
165 | 182 | | |
166 | 183 | | |
167 | 184 | | |
| |||
Lines changed: 10 additions & 3 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
339 | 339 | | |
340 | 340 | | |
341 | 341 | | |
| 342 | + | |
| 343 | + | |
| 344 | + | |
| 345 | + | |
| 346 | + | |
342 | 347 | | |
343 | 348 | | |
344 | 349 | | |
| |||
459 | 464 | | |
460 | 465 | | |
461 | 466 | | |
| 467 | + | |
| 468 | + | |
462 | 469 | | |
463 | | - | |
| 470 | + | |
| 471 | + | |
| 472 | + | |
464 | 473 | | |
465 | 474 | | |
466 | 475 | | |
| |||
479 | 488 | | |
480 | 489 | | |
481 | 490 | | |
482 | | - | |
483 | | - | |
484 | 491 | | |
485 | 492 | | |
486 | 493 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
76 | 76 | | |
77 | 77 | | |
78 | 78 | | |
79 | | - | |
| 79 | + | |
| 80 | + | |
| 81 | + | |
80 | 82 | | |
81 | 83 | | |
82 | 84 | | |
| |||
90 | 92 | | |
91 | 93 | | |
92 | 94 | | |
| 95 | + | |
| 96 | + | |
| 97 | + | |
| 98 | + | |
| 99 | + | |
| 100 | + | |
93 | 101 | | |
94 | 102 | | |
95 | 103 | | |
| |||