Thanks to Yandong's help and guidance, I now have a basic idea of what a CRF (Conditional Random Field) is and what a CRF model looks like. The CRF++ encoder, crf_learn, can also write the model in text format when given the '-t' option. Taking the Japanese word segmentation demonstration (example/seg) as an example, the model in text format looks like this:
version: 100
cost-factor: 1
maxid: 1386 /* the number of feature functions */
xsize: 1
B /* the tag list; in this case, we have two tags */
I
U00:%x[-2,0] /* unigram feature templates */
U01:%x[-1,0]
U02:%x[0,0]
U03:%x[1,0]
U04:%x[2,0]
U05:%x[-2,0]/%x[-1,0]/%x[0,0]
U06:%x[-1,0]/%x[0,0]/%x[1,0]
U07:%x[0,0]/%x[1,0]/%x[2,0]
U08:%x[-1,0]/%x[0,0]
U09:%x[0,0]/%x[1,0]
B /* bigram feature template */
0 B /* bigram of the tags for C_{-1} and C_0; */
/* the number of features is (# of tags)^2. */
4 U00:_B-1 /* _B-1 is the virtual token just before the start of a sentence */
/* _B+1 is the virtual token just after the end of a sentence */
6 U00:_B-2 /* _B-2 is the virtual token before _B-1 */
/* _B+2 is the virtual token after _B+1 */
8 U00:
10 U00:、 /* feature function id, template id, and observation */
12 U00:〇 /* since we only have two tags, each entry could */
14 U00:「 /* be expanded to 2 feature functions */
20 U00:う
... ...
... ...
1382 U09:3/年
1384 U09:9/3
-0.0799963416235706 /* the weight for each feature function */
0.4346315510326526 /* a negative weight lowers the score of the */
-0.1044728887459596 /* corresponding tag (or tag pair); we have */
-0.2501623206703318 /* 1386 weights in total. */
... ...
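
To make the id spacing in the dictionary (0, 4, 6, 8, ...) and the _B-1/_B-2 markers more concrete, here is a small Python sketch. It is not CRF++ code: the helper names weight_slots and expand are made up for illustration. The weight layout (bigram weights ordered by previous tag, then current tag) is my assumption, although it agrees with the id gaps shown in the dump.

```python
import re

TAGS = ["B", "I"]  # tag list, in the order it appears in the model

def weight_slots(feature_id, key, tags):
    """Return the weight indices owned by one dictionary entry.

    Assumed layout: a unigram entry owns one weight per tag, and a bigram
    entry owns one weight per (previous tag, current tag) pair.
    """
    n = len(tags)
    if key.startswith("B"):
        # bigram template: n * n weights, previous tag varies slowest
        return {(p, c): feature_id + i * n + j
                for i, p in enumerate(tags)
                for j, c in enumerate(tags)}
    # unigram template: n weights, one per tag
    return {t: feature_id + i for i, t in enumerate(tags)}

_MACRO = re.compile(r"%x\[(-?\d+),(\d+)\]")

def expand(template, rows, pos):
    """Expand the %x[row,col] macros of a template at token index pos.

    rows is a list of tokens, each token being a list of columns.
    Positions outside the sentence become _B-k / _B+k markers.
    """
    def token(match):
        r = pos + int(match.group(1))
        c = int(match.group(2))
        if r < 0:
            return "_B%d" % r                      # _B-1, _B-2, ...
        if r >= len(rows):
            return "_B+%d" % (r - len(rows) + 1)   # _B+1, _B+2, ...
        return rows[r][c]
    return _MACRO.sub(token, template)

if __name__ == "__main__":
    sentence = [["3"], ["年"]]                      # toy two-token input
    print(weight_slots(0, "B", TAGS))               # ids 0..3, so the next entry starts at 4
    print(weight_slots(4, "U00:_B-1", TAGS))        # ids 4 and 5, so the next entry starts at 6
    print(expand("U00:%x[-2,0]", sentence, 0))      # token two places before the first one
    print(expand("U09:%x[0,0]/%x[1,0]", sentence, 0))
```

Under this assumed layout, the first four weights in the dump belong to the bigram entry "0 B", and the two weights for "4 U00:_B-1" sit at indices 4 and 5.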