
Acta Genetica Sinica (遗传学报), January 2005, 32(1): 1-10. ISSN 0379-4172

Genome Sequence Comparative Analysis of Long Arm and Short Arm of Human X Chromosome

LÜ Zhan-Jun①, SONG Shu-Xia, ZHAI Yu, HOU Jie, HAN Li-Zhi, WANG Xiu-Fang

(Department of Laboratory Animal, Hebei Medical University, Shijiazhuang 050017, China)

Abstract: 30% of the genes tested on Xp escaped inactivation, whereas less than 3% of the genes on Xq escaped inactivation. To investigate the molecular mechanism involved in the propagation and maintenance of X chromosome inactivation and escape, the long arm and short arm of the X chromosome were compared for RNA binding density. Nucleotide sequences on the X chromosome were divided into 50 kb segments, and each segment was recorded as a set of frequency values of 7-nucleotide (7 nt) strings using all possible 7 nt strings (4^7 = 16 384). 120 genes highly expressed in tonsil germinal center B cells were selected for calculating the 7 nt string frequency values of all introns (intron 7nt). Intron 7nt was treated as an RNA population that simulated the total of small RNA fragments in cells. Knowing the 7 nt frequency values of the DNA segments and the intron 7nt, we can calculate the binding density of DNA segments to the intron 7nt, which was termed the RNA binding density. The RNA binding density was determined by the amount of complementary sequences: the more complementary sequences, the higher the RNA binding density. The RNA binding density simulated the total of small RNA fragments bound to the DNA segment. Several principal characteristics were observed for the first time: (1) the mean value of RNA binding density of DNA segments on Xp was significantly higher than that on Xq (P < 0.001); (2) the number of DNA segments highly binding RNAs was greater on Xp than on Xq (P < 0.001); (3) the clusters of RNA highly binding DNA segments were associated with regions in which genes escape inactivation. It has been suggested that RNAs activate genes and that RNA-DNA interaction in cells is extensive: for example, RNAs increase the DNase I sensitivity of DNA, there are plenty of non-protein-coding RNAs in cells, the binding specificity of DNA-RNA is far higher than that of DNA-protein, and the affinity of DNA for RNA is higher than that of DNA for DNA. The nonrandom distribution of RNA highly binding segments between Xp and Xq, combined with the finding that RNA activates genes, provides strong evidence that RNA highly binding segments may serve as DNA signals to propagate activation along a chromosome and, conversely, that DNA segments that bind less RNA may silence the genes.

Key words: Xp; Xq; inactivation escape; intron RNA; nucleotide string

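Received: 2004-03-08; Revised: 2004-09-02

About the author: LÜ Zhan-Jun (born 1952), male, doctoral supervisor, vice chairman of the Hebei Society for Immunology. Research interests: theories of aging, differentiation and tumorigenesis, and anti-aging and anti-tumor strategies.

① Corresponding author. E-mail: ***************; Tel: 0311********, 0311********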

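Comparative Genomic Analysis of the Long Arm (Xq) and Short Arm (Xp) of the Human X Chromosome

LÜ Zhan-Jun①, SONG Shu-Xia, ZHAI Yu, HOU Jie, HAN Li-Zhi, WANG Xiu-Fang

(Department of Laboratory Animal, Hebei Medical University, Shijiazhuang 050017)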

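Abstract: The X chromosome undergoes X inactivation, yet 30% of the genes on Xp escape it, compared with less than 3% on Xq. To study the molecular mechanism by which inactivation and escape of X-linked genes arise and are maintained, the simulated RNA binding density of Xq and Xp DNA sequences was compared. The nucleotide sequence of the X chromosome was divided into 50 kb segments; each segment was analyzed for 7-base (7 nt) string combinations (4^7 = 16 384 combinations in total), and the frequency of every 7 nt string in each 50 kb segment was recorded. 120 genes highly expressed in germinal center B cells were selected, and the frequencies of the 7 nt strings in their introns were calculated. This set, termed intron 7nt, was taken as the RNA population, simulating the total of small RNA fragments in cells. Given the 7 nt frequency values of a DNA segment and the intron 7nt, the binding density of that segment to intron 7nt can be calculated. The binding density of each 50 kb DNA segment to intron 7nt depends on the frequency of nucleotides in the segment that are complementary to intron 7nt: the more complementary sequences, the greater the binding density. The simulated binding density of a DNA segment to intron 7nt is called the RNA binding density and is intended to simulate the total amount of small RNA fragments that the segment can bind. The 7 nt string analysis was chosen because seven consecutive complementary nucleotides can form a relatively stable duplex. The study found: 1) the mean RNA binding density of DNA segments on Xp was significantly greater than that on Xq (P < 0.001); 2) the number of DNA segments with high RNA binding on Xp was significantly higher than that on Xq (P < 0.001); 3) clusters of RNA highly binding DNA segments were associated with the regions of the X chromosome in which genes escape inactivation. There is evidence that RNA can activate genes by changing chromatin conformation and that this effect is general: for example, RNA increases the sensitivity of chromatin to DNase I digestion, the affinity of complementary RNA-DNA is higher than that of complementary DNA-DNA, and cells contain abundant non-coding RNA and non-coding DNA. Combined with the view that RNA activates genes, these findings suggest that the larger number of genes escaping inactivation on Xp than on Xq may be related to the higher RNA binding density of Xp.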

Key words: long arm of the X chromosome; short arm of the X chromosome; inactivation escape; intron RNA; nucleotide string

CLC number: Q347  Document code: A  Article ID: 0379-4172(2005)01-0001-10

X chromosome inactivation (XCI) is the process whereby one of the two X chromosomes in normal diploid female cells is inactivated to compensate for the dosage difference of X-linked genes between males and females. One of the most intriguing aspects of X inactivation in humans is that certain genes escape inactivation and are expressed from both X chromosomes [1]. The genes that escape inactivation (expressed from both the active and inactive X chromosomes) are nonrandomly distributed, with the majority of such transcripts mapping to the short arm of the X chromosome and not to the long arm [2,3]. 30% of the genes tested on Xp escaped inactivation, whereas less than 3% of the genes on Xq escaped inactivation [4].

Although the basis for the expression of these genes from the inactive X chromosome is unclear at present, their study is likely to be informative for understanding the chromosomal mechanisms involved in X inactivation, implying the existence of local and/or chromosomal signals that distinguish genes that escape inactivation from those that are subjected to inactivation.

To investigate the molecular mechanism involved in the propagation and maintenance of X chromosome inactivation and escape, the long arm and short arm of the X chromosome were compared for RNA binding density, a computer simulation of the binding density between DNA segments and RNAs at the 7 nt string level.


1 Materials and Methods

1.3 Sequence representation

1.3.1 DNA sequences

DNA sequences on Xp (1-58 Mb) were divided into 1 144 segments, comprising 33 incomplete segments (33/1 144, 2.88%) and 1 111 complete segments. DNA sequences on Xq (60-153 Mb) were divided into 1 853 segments, comprising 40 incomplete segments (40/1 853, 2.16%) and 1 813 complete segments. Every complete segment (50 kb) was recorded as frequency values of 7 nt strings using all possible 7 nt strings (4^7 = 16 384). Incomplete segments in which more than 10% of the nucleotides were known were retained for the following count; segments with less than 10% known nucleotides were not counted. The counting method for incomplete segments was as follows: find the frequency values of the 7 nt strings, sum the values (= number of nucleotides - 6), then divide by 49 994 (the sum of the frequency values of a 50 kb segment); the result is the coefficient of the incomplete segment. The frequency value multiplied by the coefficient gives the adjusted value of the incomplete segment. The available segments numbered 1 133 on Xp and 1 841 on Xq.
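The 7 nt representation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' in-house search software: the function names `count_7nt` and `coverage_coefficient` are hypothetical, windows that contain an unknown base are simply skipped (the paper does not say how gaps were handled), and the toy segment is synthetic. A complete 50 000 nt segment contains 50 000 - 7 + 1 = 49 994 overlapping 7 nt windows, which is where the total of 49 994 comes from.

```python
from itertools import product

K = 7
ALPHABET = "ACGT"
# All 4**7 = 16 384 possible 7 nt strings, in a fixed order.
ALL_KMERS = ["".join(p) for p in product(ALPHABET, repeat=K)]
KMER_INDEX = {kmer: i for i, kmer in enumerate(ALL_KMERS)}


def count_7nt(segment: str) -> list:
    """Return the frequency of every 7 nt string in one DNA segment.

    Assumption: windows containing any base other than A, C, G or T
    (for example N in an unsequenced gap) are skipped.
    """
    counts = [0] * len(ALL_KMERS)
    seq = segment.upper()
    for start in range(len(seq) - K + 1):
        idx = KMER_INDEX.get(seq[start:start + K])
        if idx is not None:
            counts[idx] += 1
    return counts


def coverage_coefficient(counts: list, full_total: int = 49_994) -> float:
    """Sum of the 7 nt frequency values divided by the total of a complete segment."""
    return sum(counts) / full_total


toy_segment = "ACGT" * 12_500            # synthetic 50 000 nt segment
toy_counts = count_7nt(toy_segment)
print(len(ALL_KMERS), sum(toy_counts), coverage_coefficient(toy_counts))
```

For a complete segment the coefficient is 1; for an incomplete segment it is the fraction of 7 nt windows that could actually be counted.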

1.3.2 RNA sequences

120 genes highly expressed in tonsil germinal center B cells were selected for calculating the 7 nt string frequency values of all introns (from the sense strand). Each intron sequence was recorded initially as a set of 7 nt frequency values. For each 7-nucleotide string, the sum of its frequency values over all introns within the same gene, multiplied by the expression frequency of the gene, gave the intron 7nt frequency value of this gene. The sum of the intron 7nt frequency values of the 120 genes (intron 7nt) was regarded as a simulation of RNA fragments in cells.
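The construction of intron 7nt can be sketched as follows. The code is a minimal illustration, assuming that the "expression frequency" is a single numeric weight per gene; the gene names, intron sequences and weights below are invented placeholders, and the helper names are hypothetical rather than taken from the paper. Unknown bases are not filtered in this sketch.

```python
from collections import Counter

K = 7


def kmer_counts(seq: str) -> Counter:
    """Count every 7 nt string in one intron (sense strand, T in place of U)."""
    seq = seq.upper()
    return Counter(seq[i:i + K] for i in range(len(seq) - K + 1))


def gene_intron_7nt(introns: list, expression: float) -> Counter:
    """Sum the counts over all introns of a gene, then weight by its expression frequency."""
    total = Counter()
    for intron in introns:
        total.update(kmer_counts(intron))
    return Counter({kmer: n * expression for kmer, n in total.items()})


def intron_7nt(genes: dict) -> Counter:
    """Sum the weighted values of all genes into one simulated RNA population."""
    population = Counter()
    for introns, expression in genes.values():
        population.update(gene_intron_7nt(introns, expression))
    return population


# Hypothetical toy input: gene -> (list of intron sequences, expression frequency).
toy_genes = {
    "GENE_A": (["AAAAAAAA", "AAAAAAAC"], 2.0),
    "GENE_B": (["AAAAAAAC"], 0.5),
}
toy_population = intron_7nt(toy_genes)
print(toy_population)
```

In the toy input, AAAAAAA occurs three times in GENE_A and once in GENE_B, so its weighted value is 3 x 2.0 + 1 x 0.5.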

1.1 Sequence data

Nucleotide sequences of the X chromosome were obtained from the NCBI genome database (http://www.ncbi.nlm.nih.gov/genome/guide). Based on results of Digital Differential Display (DDD) (http://www.ncbi.nlm.nih.gov/UniGene.ddd.cgi), 120 genes highly expressed in tonsil germinal center B cells were selected. They are: RPL13, YY1, GLTSCR2, KIAA0217, INPP5D, NCF1, NCUBE1, PTBP1, MLL, TCF3, BACH2, YWHAQ, UBE2H, CGGBP1, CDC2, GGA2, SERP1, EGLN2, DUT, BCL2L12, WSB1, PTEN, MBNL, PAX5, BCL11A, FTH1, SMC4L1, CSNK1A1, OSBPL8, WHSC1, ALOX5, KIAA0084, ZFP91, KRAS2, FBP17, ZNF265, FUSIP1, FOXP1, CYorf15B, UGCG, CAMK2D, C9orf5, CLSTN1, DC8, CENTB2, NKTR, STK39, RERE, PSP1, PBP, MBP, DAPP1, FLJ11273, KIAA1323, NAP1L1, RASGRP1, CPNE3, UNC93B1, KIAA1033, ARS2, UBQLN1, LYN, TOMM20-PENDING, KIAA0746, PB1, MFNG, HSPCA, EIF4EBP2, GLS, FLJ22301, EHD1, ELF1, KIAA1268, OAZIN, FLJ10342, CEP1, BART1, BTF, FLJ20333, RCOR, GDI2, FLJ10407, APLP2, HNRPH1, MGC4796, CASP8, PTPRCAP, HRB2, PRKACB, MEF2B, NOLC1, LZ16, CAST, ADD3, AKAP13, AES, FLJ10392, FLJ20085, PSCD1, EIF2AK3, DDX18, CYBB, NAP1L4, PPP3CC, FLJ10707, CHERP, KIAA0494, DMTF1, RER1, MYBL1, FANCA, HSD17B12, CBX3, GNAS, NUP153, RANBP2, JJAZ1, ATM, ICAP-1A and NUP88.

1.2 Software

The search software used for analysis of the frequency values of 7-nucleotide strings was written by the staff of our research team. Each 50 kb DNA sequence was represented by a long column of numbers whose sum was 49 994. MS Excel software was used for statistical analysis.

1.4 RNA binding density of DNA sequences

Knowing the 7 nt frequency values of the DNA segments and the intron 7nt, we can calculate the binding density of DNA segments to the intron 7nt. Intron 7nt represents RNA, except that T substitutes for U. The binding density of DNA segments to intron 7nt was determined by the amount of complementary sequences. The intron 7nt value was multiplied by the 7 nt frequency value of the 50 kb DNA segment on the same row, and the sum of the products in the same column was the intron 7nt binding density of this DNA segment:

$$C_1 \times E_1 + C_2 \times E_2 + C_3 \times E_3 + \cdots + C_{16384} \times E_{16384} = F_1 + F_2 + F_3 + \cdots + F_{16384} = \text{binding density of intron 7nt to DNA}$$

This value simulates the total amount of RNA fragments binding to the DNA segment. Calculation of the binding density of DNA segments to the intron 7nt (RNA binding density) is shown in Table 1.


Table 1 Calculation of the binding density of DNA segments to the intron 7nt #

| Row | B: 7 nt string (50 kb DNA segment) | C: frequency value (50 kb DNA segment) | D: 7 nt string (intron 7nt) | E: frequency value (intron 7nt) | F: C × E |
|---|---|---|---|---|---|
| 1 | 5′-AAAAAAA-3′ / 3′-TTTTTTT-5′ | 152 | 5′-AAAAAAA-3′ | 934.39 | 142027.28 ## |
| 2 | 5′-AAAAAAC-3′ / 3′-TTTTTTG-5′ | 10 | 5′-AAAAAAC-3′ | 84.01 | 840.10 ## |
| 3 | 5′-AAAAAAG-3′ / 3′-TTTTTTC-5′ | 12 | 5′-AAAAAAG-3′ | 135.89 | 1630.68 ## |
| … | | | | | |
| 16384 | 5′-TTTTTTT-3′ / 3′-AAAAAAA-5′ | 186 | 5′-TTTTTTT-3′ | 1417.7 | 263692.20 ## |
| Sum ### (RNA binding density) | | | | | |

#: This is an Excel table and shows the calculation method of RNA binding density in a segment of 50 kb DNA. ##: C1 multiplied by E1, C2 multiplied by E2, C3 multiplied by E3, C16384 multiplied by E16384, respectively. ###: ΣF1-F16384 = RNA binding density of the DNA segment.
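The calculation in Table 1 amounts to a sum of row-wise products, which can be checked with a short Python sketch. The function name `rna_binding_density` is hypothetical, and only the four rows printed in Table 1 are used, so the result is a partial sum over those rows and not the density of a real segment.

```python
def rna_binding_density(dna_freq: dict, intron_freq: dict) -> float:
    """Sum of C x E over every 7 nt string present in the DNA segment."""
    return sum(c * intron_freq.get(kmer, 0.0) for kmer, c in dna_freq.items())


# Column C of Table 1: frequency values of the 50 kb DNA segment.
dna_freq = {"AAAAAAA": 152, "AAAAAAC": 10, "AAAAAAG": 12, "TTTTTTT": 186}
# Column E of Table 1: intron 7nt values for the same strings.
intron_freq = {"AAAAAAA": 934.39, "AAAAAAC": 84.01, "AAAAAAG": 135.89, "TTTTTTT": 1417.7}

# Column F: row-wise products, rounded to two decimals as in the table.
products = {kmer: round(dna_freq[kmer] * intron_freq[kmer], 2) for kmer in dna_freq}
partial_sum = rna_binding_density(dna_freq, intron_freq)

print(products)
print(round(partial_sum, 2))
```

The four products reproduce the F column of Table 1 (142027.28, 840.10, 1630.68 and 263692.20). The full RNA binding density would extend the same sum over all 16 384 rows.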

The nucleotide sequences in intron 7nt are complementary; for example, the frequency value of 5′-AAAAAAA-3′ is close to that of 5′-TTTTTTT-3′. The binding density of intron 7nt to the 5′→3′ DNA strand is therefore similar to that to the 3′→5′ DNA strand. In this paper the results for single-strand DNA are reported.
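One way to read the strand argument is that the counts of the opposite strand are the counts of the reported strand re-indexed by reverse complement, so the two densities agree to the extent that each intron 7nt value is close to that of its reverse complement. The sketch below illustrates this interpretation only; the helper names are hypothetical and the two-string input reuses the AAAAAAA and TTTTTTT rows of Table 1 as a toy case.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer: str) -> str:
    """Return the reverse complement of a DNA string, read 5' to 3'."""
    return kmer.translate(COMPLEMENT)[::-1]


def density(dna_freq: dict, intron_freq: dict) -> float:
    """Sum of DNA frequency x intron 7nt value over the same 7 nt strings."""
    return sum(c * intron_freq.get(kmer, 0.0) for kmer, c in dna_freq.items())


def opposite_strand(dna_freq: dict) -> dict:
    """7 nt frequencies of the opposite strand, derived from the reported strand."""
    return {reverse_complement(kmer): c for kmer, c in dna_freq.items()}


# Toy input: two rows taken from Table 1.
dna_freq = {"AAAAAAA": 152, "TTTTTTT": 186}
intron_freq = {"AAAAAAA": 934.39, "TTTTTTT": 1417.7}

forward = density(dna_freq, intron_freq)
reverse = density(opposite_strand(dna_freq), intron_freq)
print(round(forward, 2), round(reverse, 2), round(reverse / forward, 3))
```

With only two strings the agreement is rough; the statement in the text concerns the sum over all 16 384 strings.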

2 Results

2.1 RNA binding density of DNA segments on Xp and Xq

50 kb of DNA was considered as one segment; DNA sequences on Xp and Xq were therefore divided into 1 133 and 1 841 available segments, respectively. The 7 nt frequency values of each segment were multiplied by the intron 7nt values on the same string. The sum of the products in the same column was the RNA binding density. The binding density between RNAs and every DNA segment was calculated (Fig. 1, Fig. 2). The mean binding density between RNAs and DNA segments was significantly higher on Xp than on Xq (Table 2).

Xp and Xq were compared for the numbers of segments that can highly bind RNAs (high RNA binding density). At every binding density threshold from > 3 000 000 to > 2 520 000, the number of DNA segments that highly bind RNAs was significantly higher on Xp than on Xq (P < 0.001, Table 3).

2.2 RNA binding density of DNA segments in escape and inactivation regions on Xp

The regions in which escape genes were concentrated were designated escape regions (escape inactivation regions); the other regions were considered inactivation regions (Fig. 1). The mean value of RNA binding density of DNA segments in escape regions was significantly higher than that in inactivation regions (P < 0.001, Table 4). Two escape genes (eIF2-gamma and SMCX) in inactivation regions were located within the RNA highly binding segments (Fig. 1).
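The comparison of highly binding segments between Xp and Xq can be sketched as a count of segments above each density threshold. The data below are invented toy values, the function names are hypothetical, and the intermediate thresholds between 3 000 000 and 2 520 000 are not given in this text, so they are left as a parameter. The significance test behind the reported P values is not named here and is deliberately omitted.

```python
from statistics import mean


def count_above(densities: list, threshold: float) -> int:
    """Number of segments whose RNA binding density exceeds the threshold."""
    return sum(1 for d in densities if d > threshold)


def compare_arms(xp: list, xq: list, thresholds: list) -> list:
    """For each threshold, return (threshold, Xp hits, Xp total, Xq hits, Xq total)."""
    return [
        (t, count_above(xp, t), len(xp), count_above(xq, t), len(xq))
        for t in thresholds
    ]


# Hypothetical per-segment densities; the real input would be 1 133 Xp and 1 841 Xq values.
xp_toy = [3_100_000, 2_900_000, 2_600_000, 2_000_000]
xq_toy = [2_950_000, 2_400_000, 2_100_000, 1_900_000]

table = compare_arms(xp_toy, xq_toy, [3_000_000, 2_520_000])
for t, n_xp, total_xp, n_xq, total_xq in table:
    print(f"> {t}: Xp {n_xp}/{total_xp}, Xq {n_xq}/{total_xq}")
print(mean(xp_toy), mean(xq_toy))
```

The same two helpers apply to the escape and inactivation regions on Xp once each segment is labeled by region.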


Fig. 1 Simulation of binding of RNAs and DNA segments on Xp


The numbers below and above the horizontal line are the genome position from 1 Mb to 58 Mb (in units of 50 kb) and the gene numbers in those regions, respectively.

“o”: Escape genes. SLC25A6 (genome position = 21.6), DXYS155E, ALTE, stSG15779, MIC2, StSG9723, StSG1369, ARSD, GS1, Hs.79876, GS2, SEDT, CXORF5, INE2, PIR, GRPR, StSG4551, RbAp46, eIF2-gamma, CRSP150, DFFRX, DDX3, INE1, UTX, UBE1, PCTK1, SMCX (genome position = 1 038). “●”: Inactivation genes. WI-17390, MID1, HCCS, PRPS2, WI-14561, PIGA, RAI2, SCML2, PDHA1, ISPK-1, SMS, SAT, PDK3, stSG8688, POLA, GK, AA156453, PRGP1, RP3, CASK, DXS8237E, ZNF157, ARAF1, ELK1, STS-N34520, JM23, EBP, RBM3, A007K03, JM21, UGALT, Pim2H, stSG4507, TFE3, T54, A4, LMO6, WI-21198, A007H45, A007D27, IB3700, IB772, DXS1013E, FGD1, DXS7159E, TRO, stSG13253, UQCRB, WI-14025, ZXDB, ZXDA. “3”: Elevated expression genes. Hs.25625, MSL3L1, GPM6B, PHKA2, MT-ACT48, Hs.103104, Hs.192846, GATA1.
