
Acta Genetica Sinica (遗传学报), January 2005, 32(1): 1~10 ISSN 0379-4172
Genome Sequence Comparative Analysis of Long Arm and Short Arm of Human X Chromosome
LÜ Zhan-Jun①, SONG Shu-Xia, ZHAI Yu, HOU Jie, HAN Li-Zhi, WANG Xiu-Fang
(Department of Laboratory Animal, Hebei Medical University, Shijiazhuang 050017, China)
Abstract: 30% of the genes tested on Xp escaped inactivation, whereas less than 3% of the genes on Xq escaped inactivation. To investigate the molecular mechanism involved in the propagation and maintenance of X chromosome inactivation and escape, the long arm and short arm of the X chromosome were compared for RNA binding density. Nucleotide sequences on the X chromosome were divided into 50 kb segments, and each segment was recorded as a set of frequency values of 7-nucleotide (7 nt) strings using all possible 7 nt strings (4^7 = 16 384). 120 genes highly expressed in tonsil germinal center B cells were selected for calculating the 7 nt string frequency values of all introns (intron 7nt). Intron 7nt was considered as RNAs (an RNA population) that simulated the total of small RNA fragments in cells. Knowing the 7 nt frequency values of DNA segments and the intron 7nt, we can calculate the binding density of DNA segments to the intron 7nt, which was termed RNA binding density. The RNA binding density was determined by the amount of complementary sequences: the more complementary sequences, the higher the RNA binding density. The RNA binding density simulated the total of small RNA fragments bound to the DNA segment. Several principal characteristics were observed for the first time: (1) the mean value of RNA binding density of DNA segments on Xp was significantly higher than that on Xq (P < 0.001); (2) DNA segments that highly bind RNAs were more numerous on Xp than on Xq (P < 0.001); (3) the clusters of RNA highly binding DNA segments were associated with regions in which genes escape inactivation. It has been suggested that RNAs activate genes and that RNA-DNA interaction in cells is extensive: for example, RNAs increase the DNase I sensitivity of DNA, there are plenty of non-protein-coding RNAs in cells, the binding specificity of DNA-RNA is far higher than that of DNA-protein, and the affinity of DNA for RNA is higher than its affinity for DNA. The nonrandom distribution of RNA highly binding segments between Xp and Xq, combined with the finding that RNA activates genes, provides strong evidence that RNA highly binding segments may serve as DNA signals to propagate activation along a chromosome and, conversely, that DNA segments that bind less RNA may silence genes.
Key words: Xp; Xq; inactivation escape; intron RNA; nucleotide string
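Received: 2004-03-08; revised: 2004-09-02
About the author: LÜ Zhan-Jun (1952- ), male, doctoral supervisor, vice president of the Hebei Society for Immunology; research interests: theories of aging, differentiation and tumorigenesis, and anti-aging and anti-tumor strategies
① Corresponding author. E-mail: ***************; Tel: 0311********, 0311********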
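Comparative Genomic Analysis of the Long Arm (Xq) and Short Arm (Xp) of the Human X Chromosome
LÜ Zhan-Jun①, SONG Shu-Xia, ZHAI Yu, HOU Jie, HAN Li-Zhi, WANG Xiu-Fang
(Department of Laboratory Animal, Hebei Medical University, Shijiazhuang 050017)
Abstract (translated from the Chinese): The X chromosome undergoes X chromosome inactivation, but 30% of the genes on Xp escape inactivation, compared with less than 3% on Xq. To study the molecular mechanism underlying the occurrence and maintenance of X chromosome gene inactivation and escape, the simulated RNA binding density of Xq and Xp DNA sequences was compared. The nucleotide sequence of the X chromosome was divided into 50 kb segments, and a 7-base (7 nt) string analysis was performed on each segment (4^7 = 16 384 combinations in total), recording the frequency of each 7 nt string in each 50 kb DNA segment. 120 genes highly expressed in germinal center B cells were selected, and the frequency of 7 nt strings in the introns of these genes was calculated; this was termed intron 7nt and used as RNAs (an RNA population simulating the total of small RNA fragments in cells). Knowing the 7 nt frequency values of a DNA segment and the intron 7nt, the binding density of that DNA segment to the intron 7nt can be calculated. The binding density of each 50 kb DNA segment to intron 7nt depends on the frequency of nucleotides in the segment that are complementary to intron 7nt: the more complementary nucleotide sequences, the greater the binding density. The simulated binding density of a DNA segment to intron 7nt is termed RNA binding density and is intended to simulate the total amount of small RNA fragments that the DNA segment can bind. The 7 nt string analysis was adopted because seven consecutive complementary nucleotides can form a relatively stable association. The study found that: 1) the mean RNA binding density of DNA segments on Xp was significantly greater than on Xq (P < 0.001); 2) the number of DNA segments that highly bind RNA was significantly higher on Xp than on Xq (P < 0.001); 3) clusters of RNA highly binding DNA segments were associated with regions of the X chromosome in which genes escape inactivation. There is evidence that RNA can activate genes by changing chromatin conformation and that this effect is of general significance: for example, RNA increases the sensitivity of chromatin to DNase I digestion, the affinity of complementary RNA-DNA is higher than that of complementary DNA-DNA, and cells contain abundant non-coding RNA and non-coding DNA. These findings, combined with the view that RNA activates genes, suggest that the larger number of genes escaping inactivation on Xp than on Xq may be related to the higher RNA binding density of the former.
Key words: long arm of the X chromosome; short arm of the X chromosome; inactivation escape; intron RNA; nucleotide string
CLC number: Q347 Document code: A Article ID: 0379-4172(2005)01-0001-10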
X chromosome inactivation (XCI) is the process whereby one of the two X chromosomes in normal diploid female cells is inactivated to compensate for the dosage difference of X-linked genes between males and females. One of the most intriguing aspects of X inactivation in humans is that certain genes have been found that escape inactivation and are expressed from both X chromosomes[1]. The genes that escape inactivation (expressed from both the active and inactive X chromosomes) are nonrandomly distributed, with the majority of such transcripts mapping to the short arm of the X chromosome and not to the long arm[2,3]. 30% of the genes tested on Xp escaped inactivation, whereas less than 3% of the genes on Xq escaped inactivation[4].
Although the basis for the expression of these genes from the inactive X chromosome is unclear at present, their study is likely to be informative for understanding the chromosomal mechanisms involved in X inactivation, implying the existence of local and/or chromosomal signals that distinguish genes that escape inactivation from those that are subjected to inactivation.
To investigate the molecular mechanism involved in the propagation and maintenance of X chromosome inactivation and escape, the long arm and short arm of the X chromosome were compared for RNA binding density, a computer simulation of the binding density of DNA segments and RNAs at the 7 nt string level.
1 Materials and Methods
1.3 Sequence representation
1.3.1 DNA sequences
DNA sequences on Xp (1~58 Mb) were divided into 1 144 segments, comprising 33 incomplete segments (33/1 144, 2.88%) and 1 111 complete segments. DNA sequences on Xq (60~153 Mb) were divided into 1 853 segments, comprising 40 incomplete segments (40/1 853, 2.16%) and 1 813 complete segments. Every complete segment (50 kb) was recorded as frequency values of 7 nt strings using all possible 7 nt strings (4^7 = 16 384). Incomplete segments in which more than 10% of the nucleotides were known were selected for the following count; segments with less than 10% known nucleotides were not counted. The method of counting for incomplete segments was as follows: search for the frequency values of the 7 nt strings, sum up the values (= number of nucleotides - 6), then divide by 49 994 (the sum of the frequency values of a 50 kb segment); the result provides the coefficient of the incomplete segment. The frequency value multiplied by the coefficient gives the adjusted value of the incomplete segment. The available segments numbered 1 133 on Xp and 1 841 on Xq.
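The counting step described above can be sketched in Python as follows. This is only an illustrative sketch, not the authors' search software (which is not published in the paper): the function names `count_7nt` and `coverage_coefficient` are hypothetical, and it assumes that unknown bases are marked with a non-ACGT character such as `N`, so that any window containing one is skipped.

```python
from itertools import product

K = 7                  # string length used in the paper (7 nt)
SEGMENT_LEN = 50_000   # one complete segment is 50 kb

# All 4**7 = 16 384 possible 7 nt strings, in A, C, G, T order
# (AAAAAAA first, TTTTTTT last).
ALL_7NT = ["".join(p) for p in product("ACGT", repeat=K)]


def count_7nt(segment: str) -> dict:
    """Count every overlapping 7 nt string in one DNA segment.

    Assumption: unknown bases are written as a non-ACGT character
    (for example 'N'); windows containing one are not counted.
    """
    counts = dict.fromkeys(ALL_7NT, 0)
    seq = segment.upper()
    for i in range(len(seq) - K + 1):
        window = seq[i:i + K]
        if window in counts:
            counts[window] += 1
    return counts


def coverage_coefficient(counts: dict) -> float:
    """Sum of frequency values divided by 49 994, the sum for a full 50 kb segment."""
    return sum(counts.values()) / (SEGMENT_LEN - K + 1)
```

For a fully known 50 kb segment the frequency values sum to 50 000 - 6 = 49 994 and the coefficient is 1; for an incomplete segment the coefficient falls below 1 in proportion to the number of countable windows.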
1.3.2 RNA sequences
120 genes highly expressed in tonsil germinal center B cells were selected for calculating the 7 nt string frequency values of all introns (from the sense strand). Each intron sequence was recorded initially as a set of 7 nt frequency values. For each gene, the 7 nt frequency values of the same 7-nucleotide string were summed over all introns of the gene and multiplied by the expression frequency of the gene, giving the intron 7 nt frequency values of this gene. The sum of the intron 7 nt frequency values of the 120 genes (intron 7nt) was regarded as a simulation of RNA fragments in cells.
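The expression-weighted sum described above might look like the following minimal Python sketch. The function names and the input layout (a list of `(expression_frequency, intron_sequences)` pairs) are assumptions made for illustration; the paper does not specify its data format, and the expression frequencies would come from the Digital Differential Display results.

```python
from collections import Counter

K = 7  # string length used in the paper (7 nt)


def count_7nt(seq: str) -> Counter:
    """Count overlapping 7 nt strings, skipping windows with non-ACGT characters."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - K + 1):
        window = seq[i:i + K]
        if set(window) <= set("ACGT"):
            counts[window] += 1
    return counts


def intron_7nt(genes) -> Counter:
    """Build the pooled intron 7nt profile.

    genes: iterable of (expression_frequency, list_of_intron_sequences),
    a hypothetical input layout chosen for this sketch.
    """
    total = Counter()
    for expression, introns in genes:
        # Sum the 7 nt counts over all introns of the same gene ...
        gene_counts = Counter()
        for intron in introns:
            gene_counts.update(count_7nt(intron))
        # ... then weight by the gene's expression frequency and pool.
        for string, n in gene_counts.items():
            total[string] += n * expression
    return total
```

With two toy genes, `intron_7nt([(2.0, ["AAAAAAAA", "AAAAAAA"]), (0.5, ["AAAAAAAC"])])` gives a value of 6.5 for `AAAAAAA` (three occurrences weighted by 2.0, plus one weighted by 0.5) and 0.5 for `AAAAAAC`.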
1.1 Sequence data
Nucleotide sequences of the X chromosome were obtained from the NCBI genome database (http://www.ncbi.nlm.nih.gov/genome/guide). Based on the results of Digital Differential Display (DDD) (http://www.ncbi.nlm.nih.gov/UniGene.ddd.cgi), 120 genes highly expressed in tonsil germinal center B cells were selected. They are: RPL13, YY1, GLTSCR2, KIAA0217, INPP5D, NCF1, NCUBE1, PTBP1, MLL, TCF3, BACH2, YWHAQ, UBE2H, CGGBP1, CDC2, GGA2, SERP1, EGLN2, DUT, BCL2L12, WSB1, PTEN, MBNL, PAX5, BCL11A, FTH1, SMC4L1, CSNK1A1, OSBPL8, WHSC1, ALOX5, KIAA0084, ZFP91, KRAS2, FBP17, ZNF265, FUSIP1, FOXP1, CYorf15B, UGCG, CAMK2D, C9orf5, CLSTN1, DC8, CENTB2, NKTR, STK39, RERE, PSP1, PBP, MBP, DAPP1, FLJ11273, KIAA1323, NAP1L1, RASGRP1, CPNE3, UNC93B1, KIAA1033, ARS2, UBQLN1, LYN, TOMM20-PENDING, KIAA0746, PB1, MFNG, HSPCA, EIF4EBP2, GLS, FLJ22301, EHD1, ELF1, KIAA1268, OAZIN, FLJ10342, CEP1, BART1, BTF, FLJ20333, RCOR, GDI2, FLJ10407, APLP2, HNRPH1, MGC4796, CASP8, PTPRCAP, HRB2, PRKACB, MEF2B, NOLC1, LZ16, CAST, ADD3, AKAP13, AES, FLJ10392, FLJ20085, PSCD1, EIF2AK3, DDX18, CYBB, NAP1L4, PPP3CC, FLJ10707, CHERP, KIAA0494, DMTF1, RER1, MYBL1, FANCA, HSD17B12, CBX3, GNAS, NUP153, RANBP2, JJAZ1, ATM, ICAP-1A and NUP88.
1.2 Software
The search software used for analysis of the frequency values of 7-nucleotide strings was written by the staff of our research team. Each 50 kb DNA sequence was represented by a long column of numbers whose sum was 49 994. MS Excel software was used for statistical analysis.
1.4 RNA binding density of DNA sequences
Knowing the 7 nt frequency values of DNA segments and the intron 7nt, we can calculate the binding density of DNA segments to the intron 7nt. Intron 7nt represents RNA, except that T substitutes for U. The binding density of DNA segments to intron 7nt was determined by the amount of complementary sequences. Intron 7nt was multiplied by the 7 nt frequency values of the 50 kb DNA segment on the same row, and the sum of the products in the same column was the intron 7nt binding density of this DNA segment:

$$C_1 \times E_1 + C_2 \times E_2 + C_3 \times E_3 + \cdots + C_{16\,384} \times E_{16\,384} = F_1 + F_2 + F_3 + \cdots + F_{16\,384} = \text{binding density of intron 7nt to DNA}$$

This value simulates the total amount of RNA fragments binding to the DNA segment. Calculation of the binding density of DNA segments to the intron 7nt (RNA binding density) is shown in Table 1.
Table 1 Calculation of the binding density of DNA segments to the intron 7nt #

Columns A to C: 7 nt string frequency values of the 50 kb DNA segment; columns D and E: intron 7nt; column F: C×E.

| Row | A | B | C | D | E | F |
|---|---|---|---|---|---|---|
| 1 | 5′-AAAAAAA-3′ | 3′-TTTTTTT-5′ | 152 | 5′-AAAAAAA-3′ | 934.39 | 142027.28 ## |
| 2 | 5′-AAAAAAC-3′ | 3′-TTTTTTG-5′ | 10 | 5′-AAAAAAC-3′ | 84.01 | 840.10 ## |
| 3 | 5′-AAAAAAG-3′ | 3′-TTTTTTC-5′ | 12 | 5′-AAAAAAG-3′ | 135.89 | 1630.68 ## |
| … | … | … | … | … | … | … |
| 16384 | 5′-TTTTTTT-3′ | 3′-AAAAAAA-5′ | 186 | 5′-TTTTTTT-3′ | 1417.7 | 263692.20 ## |
| Sum ### (RNA binding density) | | | | | | |

#: This is an Excel table showing the calculation method of RNA binding density for one 50 kb DNA segment. ##: C1 multiplied by E1, C2 multiplied by E2, C3 multiplied by E3, …, C16 384 multiplied by E16 384, respectively. ###: ΣF1-F16 384 = RNA binding density of the DNA segment.
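The calculation in Table 1 is a sum of row-wise products, which can be written as a short Python sketch. The function name `rna_binding_density` and the dictionary layout are assumptions for illustration (the authors used their own software and Excel); the numbers are the four rows printed in Table 1, so the result here is only a partial sum over those rows, not the full 16 384-row binding density.

```python
def rna_binding_density(segment_counts: dict, intron_values: dict) -> float:
    """Sum over all 7 nt strings of C (segment frequency) times E (intron 7nt value)."""
    return sum(
        count * intron_values.get(string, 0.0)
        for string, count in segment_counts.items()
    )


# Column C of Table 1: 7 nt frequency values of one 50 kb DNA segment
# (only the four rows printed in the table).
segment_counts = {"AAAAAAA": 152, "AAAAAAC": 10, "AAAAAAG": 12, "TTTTTTT": 186}

# Column E of Table 1: intron 7nt values for the same strings.
intron_values = {"AAAAAAA": 934.39, "AAAAAAC": 84.01, "AAAAAAG": 135.89, "TTTTTTT": 1417.7}

# Column F of Table 1: the row-wise products C x E.
products = {s: segment_counts[s] * intron_values[s] for s in segment_counts}

partial_density = rna_binding_density(segment_counts, intron_values)
```

The four products reproduce column F of Table 1 (142027.28, 840.10, 1630.68 and 263692.20), and their partial sum is 408190.26.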
The nucleotide sequences in intron 7nt are complementary; for example, the frequency value of 5′-AAAAAAA-3′ is close to that of 5′-TTTTTTT-3′. The binding density of intron 7nt to the 5′→3′ DNA strand is therefore similar to that to the 3′→5′ DNA strand. In this paper the results for single-strand DNA are reported.
2 Results
2.1 RNA binding density of DNA segments on Xp and Xq
Each 50 kb of DNA was considered one segment; DNA sequences on Xp and Xq were therefore divided into 1 133 and 1 841 available segments, respectively. The 7 nt frequency values of each segment were multiplied by the intron 7nt values for the same strings. The sum of the products in the same column was the RNA binding density. The binding density of RNAs to every DNA segment was calculated (Fig. 1, Fig. 2). The mean binding density of RNAs to DNA segments was significantly higher on Xp than on Xq (Table 2).
Xp and Xq were compared for the numbers of segments that can highly bind RNAs (high RNA binding density). For every binding density cutoff from > 3 000 000 down to > 2 520 000, the numbers of DNA segments that highly bind RNAs on Xp are significantly higher than those on Xq (P < 0.001, Table 3).
2.2 RNA binding density of DNA segments in escape and inactivation regions on Xp
The regions in which escape genes were concentrated were designated escape regions (escape inactivation regions); the other regions were considered inactivation regions (Fig. 1). The mean value of RNA binding density of DNA segments in escape regions was significantly higher than that in inactivation regions (P < 0.001, Table 4). Two escape genes (eIF-2gamma and SMCX) in inactivation regions were located within the RNA highly binding segments (Fig. 1).
Fig. 1 Simulation of binding of RNAs and DNA segments on Xp
The numbers below and above the horizontal line are genome positions from 1 Mb to 58 Mb (in units of 50 kb) and gene numbers in those regions, respectively.
"o": Escape genes. SLC25A6 (genome position = 21.6), DXYS155E, ALTE, stSG15779, MIC2, StSG9723, StSG1369, ARSD, GS1, Hs.79876, GS2, SEDT, CXORF5, INE2, PIR, GRPR, StSG4551, RbAp46, eIF-2gamma, CRSP150, DFFRX, DDX3, INE1, UTX, UBE1, PCTK1, SMCX (genome position = 1 038).
"●": Inactivation genes. WI-17390, MID1, HCCS, PRPS2, WI-14561, PIGA, RAI2, SCML2, PDHA1, ISPK-1, SMS, SAT, PDK3, stSG8688, POLA, GK, AA156453, PRGP1, RP3, CASK, DXS8237E, ZNF157, ARAF1, ELK1, STS-N34520, JM23, EBP, RBM3, A007K03, JM21, UGALT, Pim2H, stSG4507, TFE3, T54, A4, LMO6, WI-21198, A007H45, A007D27, IB3700, IB772, DXS1013E, FGD1, DXS7159E, TRO, stSG13253, UQCRB, WI-14025, ZXDB, ZXDA.
"*": Elevated expression genes. Hs.25625, MSL3L1, GPM6B, PHKA2, MT-ACT48, Hs.103104, Hs.192846, GATA1.
