While the UCD provides the language of the DOCSIS network, Station Maintenance messaging is the proverbial “heartbeat” of the DOCSIS network. A station maintenance session consists of a Range Request sent from a cable modem and a Range Response sent by the CMTS. The CMTS analyzes the signal quality of the Range Request message and sends back any necessary RF adjustments in the Range Response message. This “handshake” between every cable modem and the CMTS must occur at least once every 30 seconds as dictated by the DOCSIS specification.
The following is an example of what a Range Request message looks like coming from a cable modem Source MAC Address (PDU SA) of 00204066A0AE to the CMTS Destination MAC Address (PDU DA) of 00070DAEC8A8:
Note: If you Google “OUI” and then the first six digits of any MAC address you will get a “MAC Address Lookup” hit that will tell you the vendor name of the equipment that you are using. For example, Google “OUI 002040” and find out what cable modem manufacturer sent this Range Request message.
Upstream MAC type = RNG-REQ
MAC FC (HEX) = C0
MAC PARM (HEX) = 00
MAC LEN (HEX) = 001C
MAC HCS (HEX) = EA1D
PDU bytes = 28
PDU DA (HEX) = 00070DAEC8A8
PDU SA (HEX) = 00204066A0AE
PDU Type/Len (HEX)= 000A
MMH Version (HEX)= 01
MMH Type (HEX)= 04
MMH RSVD (HEX)= 00
————————————-
MMM Type: RNG-REQ
SID (HEX): 0201
Downstream Channel ID (HEX): 03
Pending Till Complete: 0
————————————-
After examining the Range Request (from here on out I will use RNG-REQ per the DOCSIS specification) you will find that it contains very little information. Clearly it must contain the type of message, which is a RNG-REQ. It also carries the source and destination MAC addresses so that the CMTS knows how to route the message and which cable modem the message came from. The RNG-REQ must also contain the modem’s permanent SID (or a temporary SID if it is just coming online) and the Downstream Channel ID (i.e. the identifier of the downstream DOCSIS channel on which it is receiving data). As seen in the message, the RNG-REQ is only 28 bytes long, so this is a very short message, which is important since it is sent by every cable modem at least once every 30 seconds (CMTS vendors normally reduce this time to 20 seconds or less to eliminate T4 timeouts; more on this shortly). The Pending Till Complete field is also required. Since it has a value of zero (0) in this example, it indicates that all changes requested by previous Range Responses were completed before this RNG-REQ was transmitted. A non-zero value is the estimated time needed to finish applying the new parameters from the previous Range Response. Note that only equalization can be deferred. Units are unsigned centiseconds (10 msec). The referenced changes are discussed in the following section on the Range Response.
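The fields discussed above can be cross-checked with a few lines of Python. This is only an illustrative sketch: the function names are made up for this example, and the values are copied by hand from the RNG-REQ dump rather than parsed from a real capture.

```python
def oui(mac_hex):
    """Return the vendor prefix (first three bytes) of a MAC address string."""
    return mac_hex[:6].upper()


def pending_till_complete_ms(centiseconds):
    """Convert Pending Till Complete (units of 10 msec) to milliseconds."""
    return centiseconds * 10


# Values copied by hand from the RNG-REQ example
mac_len = int("001C", 16)   # MAC LEN field, given in hex
pdu_bytes = 28              # PDU byte count reported in the dump
pdu_sa = "00204066A0AE"     # cable modem MAC address (PDU SA)

print(mac_len == pdu_bytes)         # the hex length matches the byte count
print(oui(pdu_sa))                  # the prefix to look up: 002040
print(pending_till_complete_ms(0))  # 0 means nothing is still pending
```

The same hex-to-decimal check works for the RNG-RSP, where MAC LEN 002B corresponds to 43 PDU bytes.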
Once the CMTS receives a RNG-REQ from a cable modem it must send a Range Response (from here on out I will use RNG-RSP) within 200 msec per the DOCSIS specification. If the cable modem does not receive the RNG-RSP within this interval, a T3 timeout error will be generated. CMTS vendors configure their equipment to send RNG-RSP messages much faster than 200 msec to avoid CMTS-related T3 timeout errors. Before the CMTS sends the RNG-RSP it must first analyze the signal integrity of the RNG-REQ to determine whether the receive level at the CMTS is correct, whether the cable modem transmit frequency is on center, whether the burst arrived at exactly the time the CMTS was expecting it and, if equalization is being used, whether any equalization parameters need to be adjusted (equalization is not enabled in the example below). Once this analysis is complete, the CMTS sends a RNG-RSP message back to the cable modem with any necessary corrections. This provides a certain amount of “self-healing” in a DOCSIS network to make up for what are called “diurnal” or daily changes in the HFC network, which change RF attenuation and gain. Additionally, subscribers may move a cable modem, eMTA, or DOCSIS-enabled set-top box to a new location in their home that has added attenuation. Further, customer premises equipment (CPE) may age, causing component drift in upstream transmitters that can be corrected to some extent by the CMTS via the RNG-RSP message. All of this helps maintain a DOCSIS network, but cable modems do have maximum limits, and once they are reached troubleshooting must begin.
Let’s now analyze the content of the RNG-RSP to see the changes it may communicate to a cable modem:
Downstream MAC type = RNG-RSP
MAC FC (HEX) = C2
MAC PARM/EHDR length (HEX) = 00
MAC LEN (HEX) = 002B
MAC HCS (HEX) = A061
PDU bytes = 43
PDU DA (HEX) = 00204066A0AE
PDU SA (HEX) = 00070DAEC8A8
PDU Type/Len (HEX)= 0019
MMH Version (HEX)= 01
MMH Type (HEX)= 05
————————————-
MMM Type: RNG-RSP
SID (HEX): 0201
Upstream Channel ID (HEX): 01
————————————-
RNG-RSP TLV Type: 1
Timing Adjust: 0
————————————-
RNG-RSP TLV Type: 2
Power Level Adjust: -7
————————————-
RNG-RSP TLV Type: 3
Offset Frequency Adjust (Hz): 0
————————————-
RNG-RSP TLV Type: 5
Ranging Status: 3
————————————-
In the example above, you will first notice that the PDU DA and PDU SA MAC addresses have switched positions from the RNG-REQ example. So we are still dealing with the same two pieces of equipment (same cable modem and same CMTS – did you Google the OUI yet?).
The Timing Adjust is the amount by which the cable modem must change its transmit time in order for its bursts to arrive at the CMTS when the CMTS expects them. The units are (1 / 10.24 MHz) = 97.65625 ns. A negative value means that the Ranging Offset will be decreased and the cable modem’s bursts will arrive later. Timing is quite critical in a TDMA-based system because if the bursts from one cable modem come too early or too late they could potentially collide with another modem, in which case both transmissions will be corrupted and all data is lost. In this example the timing adjustment is zero (0), so the cable modem is dead on from a timing perspective.
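To get a feel for the size of one Timing Adjust unit, here is a small Python sketch that converts a raw value into nanoseconds. The function name is my own for illustration; the only fact it relies on is the unit of 1 / 10.24 MHz quoted above.

```python
# One Timing Adjust unit is one period of a 10.24 MHz clock.
TIMING_UNIT_NS = 1_000_000_000 / 10_240_000  # 97.65625 ns


def timing_adjust_ns(units):
    """Convert a signed Timing Adjust value into nanoseconds."""
    return units * TIMING_UNIT_NS


print(timing_adjust_ns(0))    # the example above: no correction needed
print(timing_adjust_ns(-64))  # a negative value: later transmission, by 6250 ns
```

Note that 64 units work out to exactly 6.25 microseconds, so even a correction of a few dozen units moves a burst by several microseconds.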
Power Level Adjust is the next TLV (Type Length Value) parameter. This is a signed integer value that tells the cable modem to increase its transmit power when positive and decrease its transmit power when negative. The units are 0.25 dB increments, so in the example above the adjustment is -7 * 0.25 dB = -1.75 dB. This indicates the cable modem is currently transmitting almost 2 dB too high and needs to reduce its transmit power accordingly.
The center frequency of the upstream transmitter in the cable modem is adjusted by the Offset Frequency Adjust parameter. Its units are in Hertz and in the example above no adjustment is necessary.
The Ranging Status parameter is used to tell the cable modem the status of the next station maintenance message. The available options are 1 = continue, 2 = abort, 3 = success. In the example a “3” or “success” was transmitted, so ranging will continue normally. If a “1” were transmitted, this would indicate that major adjustments need to be made and rapid station maintenance messages would occur. An abort would tell the cable modem to re-transmit a RNG-REQ and that no adjustments were being made in this RNG-RSP.
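Putting the four TLVs together, the RNG-RSP example can be translated into engineering units with a short Python sketch. The function and dictionary names are hypothetical, and the input is simply the TLV values typed in from the dump above, keyed by TLV type; this is not a parser for raw DOCSIS frames.

```python
RANGING_STATUS = {1: "continue", 2: "abort", 3: "success"}


def describe_rng_rsp(tlvs):
    """Translate RNG-RSP TLV values (keyed by TLV type) into readable units."""
    result = {}
    if 1 in tlvs:
        result["timing_adjust_ns"] = tlvs[1] * 97.65625  # units of 1/10.24 MHz
    if 2 in tlvs:
        result["power_adjust_db"] = tlvs[2] * 0.25       # units of 0.25 dB
    if 3 in tlvs:
        result["frequency_adjust_hz"] = tlvs[3]          # already in Hz
    if 5 in tlvs:
        result["ranging_status"] = RANGING_STATUS.get(tlvs[5], "unknown")
    return result


# TLV values from the RNG-RSP example
example = {1: 0, 2: -7, 3: 0, 5: 3}
for name, value in describe_rng_rsp(example).items():
    print(name, value)
```

For the example message this reports a power adjustment of -1.75 dB and a ranging status of "success", matching the walk-through above.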
There are additional functions that can occur during station maintenance which are not included in this example which include:
I wanted to close this topic with a discussion on T3 and T4 timeouts, as I have received many questions on “what causes T3 and T4 timeouts?” Fundamentally, a T3 timeout occurs when the cable modem fails to receive a RNG-RSP from the CMTS within 200 msec (after sending a RNG-REQ). A T4 timeout occurs when a cable modem fails to receive a RNG-REQ transmit grant MAP from the CMTS within 35 sec on the downstream channel (after receiving the previous RNG-RSP). There are a number of reasons that these could occur. I will list a couple below for each to help guide you to the root cause, but understand this is not an exhaustive list.
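The two timers can be summarized in a small Python sketch. It is a simplification for illustration only: the function name and the idea of passing in two measured waiting times are my own, and a real cable modem tracks these timers inside its MAC state machine.

```python
T3_LIMIT_S = 0.200  # max wait for a RNG-RSP after sending a RNG-REQ
T4_LIMIT_S = 35.0   # max wait for the next RNG-REQ transmit grant (MAP)


def check_timers(rsp_wait_s, grant_wait_s):
    """Return the timeouts a modem would log for the given waiting times."""
    events = []
    if rsp_wait_s > T3_LIMIT_S:
        events.append("T3")
    if grant_wait_s > T4_LIMIT_S:
        events.append("T4")
    return events


print(check_timers(0.050, 20.0))  # healthy modem: no timeouts
print(check_timers(0.250, 20.0))  # RNG-RSP arrived late or never: T3
print(check_timers(0.050, 36.0))  # no grant within 35 sec: T4
```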
A final note, T3 and T4 impairments are always within the DOCSIS network inclusive of the CMTS and cable modems. In fact it is the cable modems themselves that are doing the counting and alerting of T3 and T4 timeouts. This is one time that the IP network behind the CMTS is not to blame for any of these errors. No finger pointing on this one!
I am working in a CATV company; we also provide internet through DOCSIS 2. D/S freq is 602.
My question: how does a cable modem lock on the 602 QAM frequency only? We have a lot of QAM frequencies. How does the cable modem lock on 602 only, and on the upstream, how does it lock on only 35 MHz or 40 MHz?
The reason that your cable modems know to lock on the 602 MHz DOCSIS QAM channel and not the other QAM channels is that the DOCSIS QAM channel is transmitting SYNC, UCD and MAP messages. A cable modem will lock to any 64- or 256-QAM channel, but if the channel does not carry the three messages above, the modem will move on to the next QAM channel, lock and look for the messages again.
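The downstream scan described above can be sketched in Python as follows. This is a toy model, not modem firmware: the function name is made up, and apart from 602 MHz the frequencies in the sample list are invented for illustration.

```python
REQUIRED_MESSAGES = {"SYNC", "UCD", "MAP"}


def find_docsis_downstream(channels):
    """Return the first frequency whose channel carries SYNC, UCD and MAP.

    channels is an iterable of (frequency_mhz, messages_seen) pairs in the
    order the modem would scan them.
    """
    for frequency_mhz, messages_seen in channels:
        if REQUIRED_MESSAGES <= set(messages_seen):
            return frequency_mhz  # this is a DOCSIS channel: stay here
        # otherwise it is just another QAM channel: move on to the next one
    return None


# Hypothetical scan: only 602 MHz carries all three DOCSIS messages.
scan = [(590, set()), (596, set()), (602, {"SYNC", "UCD", "MAP"}), (608, set())]
print(find_docsis_downstream(scan))  # 602
```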
On the upstream, the UCD tells the modem what upstream frequencies are available. You may have more than one upstream frequency, but often the cable modem is only physically connected to one upstream port on a CMTS, so it can only transmit on 35 MHz vs. 40 MHz (in your example). In cases where load balancing is used and all upstream ports are tied together, it is likely the CMTS that is telling the cable modem which upstream frequency to register with in order to balance the upstream capacity on the CMTS.
-Brady
Hi,
First – great blog, enormous amount of practical knowledge. Congratulations and thanks for your work!
Second – a small question. Let’s consider the situation where all upstream ports are tied together. According to your other article, one UCD is sent on the downstream channel for each upstream channel on a CMTS line card. In this example we have two upstream cards and two UCDs sent. Please tell me how the CM knows which UCD to use to obtain its parameters for transmitting data. Before the CM can send its first Range Request it has to choose the proper upstream channel, so can you explain the procedure for choosing the correct UCD?
Best regards,
Mike
Do any CMs utilize the DOCSIS PID (0x1FFE) as defined in MPEG transport to find the downstream channel instead of SYNC-MAP-UCD?
Great site, thanks for putting it up.
is there a way to determine the SPECIFIC Upstream Frequency that a modem uses to transmit? SNMP?? MIB??
Once registered, does a CM stick to this transmit frequency?
just trying to track down a rogue modem by using a spectrum analyzer!
Kanuto,
The CMTS is the primary device that will know the upstream frequency that each cable modem is transmitting on. So the most direct route to get this information is to query via SNMP the CMTS or other backoffice system that has already retrieved the information from the CMTS. That being said, the information can still be obtained through other indirect methods via examining the DOCSIS protocol communications between the CMTS and the cable modem. There are a couple of test equipment companies in the market today that have the capability of doing this. This is very beneficial for identifying “rogue” modems as you are suggesting. One is JDSU and the other is Averna.
Regards,
-Brady
Thanks, but this one is very evasive and is not registering as a legit modem. It is a hacked one, using a Moto modem.
The issue is that I know which US port it is connecting to and am trying to locate it physically using a DSAM 6000’s spectrum analyzer. I want to catch the sub without alerting him.
“Timing is quite critical in a TDMA-based system because if the bursts from one cable modem come too early or too late they could potentially collide with another modem, in which case both transmission will be corrupted and all data is lost. In this example the timing adjustment is zero (0), so the cable modem is dead on from a timing perspective.”
I believe that I would sleep better knowing that we have a solid timing source on the network, and am concerned that the NTP source which the CMTS and modems are synching to is running on an NT box doing things other than strictly providing timing.
Is there a debug I can run on a UBR, or some way I can glean whether TLV Type 1 adjustments are occurring?
Okinawajoe,
There is a “show timing adjust” command in the CMTS CLI, though that is not the exact syntax for it. In my experience with protocol analyzers on DOCSIS 1.x and 2.0 networks, I have observed cable modems transmitting as much as 300 us to 400 us outside of their designated time slots. The typical CMTS de-registration point is 500 us. So even though the cable modems are transmitting far outside of where they should be, the CMTS allows them to remain online since they are <500 us.
What does this mean for other cable modems in the network? Since the smallest timeslot or tick is 6.25 us (micro seconds) and the typical mini-slot is 12.5 us, these potentially "rogue" cable modems can be transmitting data at the same point in time that other cable modems are transmitting data. When two devices transmit data in the RF domain, the data of both devices is usually destroyed. So the cable modem transmitting out of its time slot has lost data in addition to the data of the cable modem that is doing what it is supposed to be doing. How often does this happen? In some plants I measured percentages of modems as high as 3-4%. I did not know the age of these modems, but realized that if these were highly active modems, they could do some serious damage to VoIP traffic.
I don't believe that a good timing source will fix this problem, because I could see the CMTS attempting to pull the modems back in via Station-Maintenance messages. The modems simply had problems, i.e. bad PLLs, crystals, etc. They needed to be identified and replaced. This is a clear need for troubleshooting and preventative maintenance.
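To put those offsets in perspective, here is a rough Python sketch that expresses a timing error in mini-slots and in Timing Adjust units. The function name is my own, and the 12.5 us mini-slot and 500 us de-registration point are the typical values I mentioned above, not fixed constants of every CMTS.

```python
MINISLOT_US = 12.5            # typical mini-slot duration
TIMING_UNIT_US = 0.09765625   # one Timing Adjust unit (1 / 10.24 MHz)
DEREG_LIMIT_US = 500.0        # typical CMTS de-registration point


def describe_offset(offset_us):
    """Express a modem's timing error in mini-slots and Timing Adjust units."""
    return {
        "minislots": offset_us / MINISLOT_US,
        "timing_adjust_units": offset_us / TIMING_UNIT_US,
        "stays_online": offset_us < DEREG_LIMIT_US,
    }


for offset in (300.0, 400.0):
    print(offset, describe_offset(offset))
```

A modem that is 400 us off is therefore overlapping on the order of 32 typical mini-slots while still being allowed to stay online.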
Regards,
-Brady
“A final note, T3 and T4 impairments are always within the DOCSIS network inclusive of the CMTS and cable modems. In fact it is the cable modems themselves that are doing the counting and alerting of T3 and T4 timeouts. This is one time that the IP network behind the CMTS is not to blame for any of these errors. No finger pointing on this one!”
An especially encouraging comment … I am searching for the best way to analyze and qualify the quality of the DOCSIS / RF network … How can T3 and T4 timeouts be identified? Is an analyzer required? I suspect yes, as the CMTS will not be collecting the stats for timeouts which occur as a problem on the plant?
Okinawajoe,
Good observations on segmenting the RF plant from the IP plant.
But you should know that the CMTS does collect T3 and T4 stats. In addition, if you are at a subscriber’s house you can log into the cable modem diagnostic screen and retrieve the T3 and T4 information directly from the modem history. So a very useful database to build is one that correlates cable modems, RF network topology and T3 / T4 stats. This should provide a very useful tool in preventative and reactive plant maintenance. No IP involved here as you have identified!
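As a starting point for that kind of correlation, here is a minimal Python sketch that totals T3 / T4 counts per node. Everything in it is an assumption for illustration: the record layout, the node names and the counts are invented, and a real tool would pull them from the CMTS and a plant topology database.

```python
from collections import defaultdict


def timeouts_by_node(records):
    """Total T3/T4 counts per RF node.

    records is an iterable of (mac, node, t3_count, t4_count) tuples.
    """
    totals = defaultdict(lambda: {"modems": 0, "t3": 0, "t4": 0})
    for mac, node, t3_count, t4_count in records:
        entry = totals[node]
        entry["modems"] += 1
        entry["t3"] += t3_count
        entry["t4"] += t4_count
    return dict(totals)


# Hypothetical records: (modem MAC, node name, T3 count, T4 count)
records = [
    ("00204066A0AE", "node-12", 14, 2),
    ("0020406600B1", "node-12", 9, 0),
    ("0020406600C7", "node-07", 0, 0),
]
for node, stats in sorted(timeouts_by_node(records).items()):
    print(node, stats)
```

Nodes where many modems share high T3 counts point toward the plant, while a single bad modem on an otherwise clean node points toward the home.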
Regards,
-Brady
I work for a major cable operator in Florida, where I am assigned escalated calls. Lately I have been getting complaints of the cable light going off and coming back on within 10 seconds. The funny thing is the DOCSIS monitoring software does not catch it. At first I thought the customer was looking for credit, but more and more customers are saying the same thing. Headend says all is 100% and the hubs come back clean, however the DOCSIS report shows numerous “T3 timeout, no response” entries. Is this a plant integrity issue? I am told by supervisors these are normal. Can you help with any web sites or information to help resolve this issue?
Hi Cable Tech,
It is very difficult to diagnose your problem based upon your description, but I believe you have a legitimate problem. My suspicion is that your “DOCSIS monitoring software” is either polling the CMTS via SNMP, which happens rather infrequently and so misses the modems going offline, or it is polling at Layer 3 (TCP/IP) and not at the DOCSIS layer, where the T3 timeouts you are seeing are occurring. There is another section in my blog that talks about T3 and T4 timeouts specifically and how they are a result of upstream and downstream impairments because station maintenance messages are not able to get through. I recommend that you learn a little more about your DOCSIS monitoring software. This will help you understand exactly what it is telling you about your network. Once you understand what it is telling you, you will also know what it is not telling you about your network, which is much more important.
Regards,
-Brady
Hi Brady,
Sometimes in our lab cable network (spread across 3 floors), almost all CMs lose their IPs. We then have to remove one lot of 10 CMs, after which the others get their IPs and come back on the network. We are only using 80 CMs (in fact STBs) overall.
We are not able to find out what the problem could be, as no noise is observed from that part of the floor.
Also, the SNR value ‘US phy SNR_estimate for good packets – 27.1886 dB’ from the command ‘sh contro cab1/0 upstr 0’ drops to 9 to 11 dB.
I have an interesting issue regarding T3 timeouts on 2 different Moto modems I’ve tried. At exactly 8:40pm CST every night my connection and latency suffer until 4am or so (I can’t stay up every night to get the exact time), but the connection is then perfect for the entire day. Latency and speed are well within parameters, but at 8:40pm my modem displays “No Ranging Response received – T3 time-out”. Why is this happening at 8:40 and at no other time? It’s getting very frustrating having to call my cable provider every evening about this.
HELP?
Thank you much
Hi Roger,
You have what sounds like a classic case of an upstream impairment that is time-dependent. Do you by chance have any heavy equipment around you that fires up at 8:40pm? If this is a new occurrence, see if there is a streetlight that turns on at this same time. Something is likely coming on which is emitting a lot of ingress into the upstream. It is preventing your RNG-REQ from reaching the CMTS, hence your T3 timeouts. In addition it is causing packet loss, which you are seeing as latency, because in a TCP/IP flow the packets are just being retransmitted. Of course I am presuming you are basing the latency off of HTTP page requests or gaming results. If using VoIP or video conferencing, which is UDP-based, there is no packet re-transmission and so you would have noticeable packet loss and hence degraded service quality.
I think it is important you work with your cable operator and have them measure errored code words (measured on the CMTS) on your modem during the evening hours (8:40pm and after). This will be the only way that they will actually see the problems you are experiencing. If they test in the day time, everything will look just fine. Ideally, you may be able to identify the source of ingress and point it out to the cable operator so that they can fix the leak in their plant. This is often a difficult battle for the cable operator and everyone pays the price.
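One simple way to make a time-of-day pattern like this visible is to bucket the T3 timestamps from the modem log by hour. The Python sketch below assumes the timestamps have already been collected as strings in the form shown; the function name and the sample timestamps are invented for illustration.

```python
from collections import Counter
from datetime import datetime


def t3_by_hour(timestamps):
    """Count T3 events per hour of day from 'YYYY-MM-DD HH:MM:SS' strings."""
    hours = Counter(
        datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").hour for ts in timestamps
    )
    return dict(sorted(hours.items()))


# Hypothetical T3 timestamps gathered over two evenings
sample = [
    "2011-07-11 20:41:00",
    "2011-07-11 20:55:10",
    "2011-07-12 02:10:00",
    "2011-07-12 20:43:30",
]
print(t3_by_hour(sample))  # evening hours stand out in the counts
```

If nearly all of the events fall between the same hours every day, that is strong evidence to hand to the operator that testing has to happen in that window.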
Good luck, let me know how it turns out.
-Brady
Hi Brady, we work at HOT, an Israeli MSO, and lately we are facing a severe SNR problem in the field, as described below:
We have a very unstable SNR (between 18 and 36 dBc) in some of our US ports.
Our configuration US: 64 QAM TDMA/ATDMA 3.2 MHz mix mode
DS: 256 QAM 6 MHz
We checked the HFC network RF performance and it’s O.K. (C/N 35-40 dBc) No Ingress / spurious etc’.
In the specific US port we have 300 modems and 300 STBs and 30 STBs only are constantly cycling from offline to usually Init(rc).
We examine the log file from a STB which was at the subscriber home and we noticed that the STB was stuck in a cycle named
“No ranging response received – T3 time out”. The STB came back Online only after a “hard” reset (power supply disconnect/connect).
After a power resetting all the offline/init STBs registered properly and the SNR was improved dramatically and is stable to around 34-36 dBc.
Could you please advise us what your assumption is about the STB problem?
Thank you in advance
Hi Robert and Oren,
I have two thoughts on your problem, both of which I have experienced before.
The first one is rather common, and I suspect you have already investigated it. This could be a firmware / IOS issue in the set tops or CMTS. Check with your vendor to see if this is a known issue and/or upgrade to the latest firmware.
The other possibility could be an issue associated with your upstream configuration. Since I don’t know the exact setup I can’t be certain, but you mentioned that you have a TDMA/A-TDMA mixed mode system. Are you using logical channels on the same frequency or two different frequencies? If logical channels, you could have the set tops with low SNR registering in A-TDMA mode when they should not be. Similarly, if you are using two different channels, you could have a plant isolation issue (typically a serious problem) where a modem transmits at a very high level to register on the wrong upstream channel. In this case the modem will have some RNG-REQ get through to the CMTS and some not, creating a perpetual init(RC) condition or if the modem is able to register because upstream losses are low enough, typically SNR levels will be poor. The SNR will be poor because the CM is transmitting backwards through a headend combiner. The best way to prevent this isolation issue from happening is to ensure that you have your plant configured so that your modems do not have enough transmit power to blast through headend combiners (i.e. they should typically be sitting around a transmit level 40-45 dBmV). This does require appropriate padding on the upstream path after the return path receiver in your headend. The reason that this would go away upon reboot would be that the modems go back and successfully register on the correct upstream channel.
Other than that, I would need to know more about your specific system in order to provide any other advice. I’m sure there are other possibilities, but this is what I can think of given the information I have.
-Brady
First , Thank you for the detailed answer.
We are on holiday for the next few days and will reply next week.
Once again THANK YOU.
Hi Brady, thanks for your kindly help.
Referring to your answer, we don’t work with logical channels on the same frequency. We don’t think that we should concentrate on RF isolation problems, because our SNR problem has increased in the last months with a new type of STB installed.
Do you think that a hardware/PLL problem can cause the phenomena described?
In addition we would like to ask you please some questions that will help “understand the material”
1. There is a command named “Stop-im” in CISCO CMTS that says to specific US “stop registration new modems “. When we are operating this command the SNR became very stable – between 33-36 dB .
We want to understand if in the stop-im command the SNR calculation algorithm of the CMTS is ignoring the unregistered modems or the unregistered modems actually stop to transmit at the specific US and because of that the US SNR improves.
Can you see any insight between the stop-im and our SNR problem?
2. In CISCO CMTS there is a command that presents the SNR per modem in a specific US (show cable modem phy).
Can you please advise how the CMTS calculates this SNR?
Is it for the last second/minute ? is it a long time cumulative SNR or instantaneous SNR ?
Do you have experience with correlating “per modem SNR” to “whole US SNR ”
Thanks in advance
i love you for this information, everytime i ask about T3 and T4 timeouts people tend to say you have no business in the modem config page LOL
Hi,
I work for a major cable provider as a plant technician and have a question; maybe someone who is not in this ongoing discussion has a reasonable point of view. My issue is rogue/babbling modems causing all others to drop offline intermittently until we can find and manually reboot or disconnect the offending modem. It’s not modem related, as we have multiple manufacturers. The issue is only observed in our hubs that use Cisco 10k devices. Our Motorola BSR/uBRs never have these. The rogue modem typically just goes in and out of ranging and offline mode, causing the CMTS to drop all other commands. I have been searching without much luck to find out whether this is a Cisco issue or whether our engineering department is setting up the CMTS in a manner that can cause this. I can give some info on what the modem typically shows when in this babbling state: typically ranging r2 to offline, and so on and so forth. The rx-p is usually -15 to -18, which is very high. In most cases where we have a dual stack or are using 2 upstream ports, this modem can travel back and forth and sometimes cause, or have the same cause as, other rogue modems on the same upstream ports. The issue just seems to get out of hand: where originally we would have 1 modem, last night we had 15 modems throwing out ranging commands for over 48 hrs. Any help or ideas on where to search would be great, as this is a terrible inconvenience for our customers and all the higher-ups want to do is point fingers. If you need more info let me know. Thank you
Hi Matt,
There are so many variables that it is difficult for me to provide any guidance without being in your system. The one thing you mentioned that really sticks out, though, is the rx-p, which I interpret to be the receive power at the CMTS. If your receive power at the CMTS is -15 dBmV to -18 dBmV, that is way too low. Ideally you want it to be 0 dBmV +/- 3 dBmV. If you’re down at -15 to -18, I would say that is your problem.
-Brady
Food for thought?
Was just bouncing around doing some research on some wideband problems we’re seeing.
Is it just me or does sunrisetelecom say the exact opposite? Of course they’re saying “packet loss” where you say “problem”. :-)
Above you’re saying:
T3 Timeout No RNG-RSP in 200 msec – Typically an Upstream Problem
T4 Timeout No RNG-REQ Grant MAP Received by Cable Modem in 35 sec – Typically a Downstream Problem
Stumbled across this:
http://www.sunrisetelecom.com/support/article_docsis_voip_troubleshooting.php
These common issues should be the first line of “Have we checked these parameters?” from both the RF and IP front. Additionally, technicians are often told they have T3 or T4 timeouts by backoffice specialists.
A T3 timeout usually indicates downstream packet loss and a T4 timeout usually indicates upstream packet loss — both RF impairments.
Funny how my past comes to haunt me. I wrote that article when I worked at SunriseTelecom for Broadband Gear Report (BGR). I did not edit it closely enough before going to print and reversed T3 and T4 – it happens.
The description in my blog is accurate, the one on the SunriseTelecom website is inaccurate. If I still worked there I would correct it on their website. My apologies for the original error in the BGR article. I should have been more careful.
-Brady
Tech ops supervisor here, just trying to educate myself on modem logs and what can be looked at in certain situations.
This is probably a poor example, but just a quick one off the top of my head.
When should you be concerned when looking into a modem history? I see critical, minor, and major errors on almost all reports. Some modems never have an issue, some are chronic complaints. I have been researching T3/T4 timeouts like it’s my second job, just to get a better understanding of “what” to look for to resolve them. It’s so confusing whether it’s our plant, house wiring, or head end.
This modem report covers the 5 days since a “master reboot” after the customer complained of random, frequent loss of connection. So I checked our software programs like Scout Monitor etc. and didn’t see anything crazy except for lots of T3 timeouts and lost syncs. Obviously the reboot cleared the old history, but this is the new one. Should I still be worried, or did we fix the issue with this modem? That’s what I wonder after all these kinds of calls.
As for the example of errors, notice the 3000+ critical:
(again, probably not the best example, but it’s been my most recent call I went out on)
DHCP WARNING – Non-critical field invalid in response. critical 1 2011-07-11 09:42:31 2011-07-11 09:42:31
Received Response to Broadcast Maintenance Request, But no Unicast Maintenance opportunities received – T4 timeout critical 1 2011-07-11 09:54:19 2011-07-11 09:54:19
No Ranging Response received – T3 time-out critical 2 2011-07-11 10:01:09 2011-07-11 10:01:08
TFTP failed – configuration file NOT FOUND critical 1 2011-07-11 10:01:19 2011-07-11 10:01:19
ToD request sent- No Response received error 2 2011-07-11 10:01:20 2011-07-11 10:01:20
SYNC Timing Synchronization failure – Failed to acquire QAMQPSK symbol timing critical 3667 2011-07-11 10:05:15 2011-07-11 09:53:45
SYNC Timing Synchronization failure – Failed to acquire FEC framing critical 4 2011-07-11 10:05:16 2011-07-11 10:03:23
DHCP WARNING – Non-critical field invalid in response. critical 1 2011-07-11 10:05:29 2011-07-11 10:05:29
Thanks in advance…also, any other good links you have on similar issues, real life events??
Hi Casey,
The modem log you have provided looks like a pretty classic example of RF plant impairments. They are likely intermittent, so when you send out a tech everything looks fine (levels, MER, etc.), but that modem is definitely having frequent problems throughout the day and the customer is feeling the pain. So why do I think this? Let me explain:
There are three main issues: T3/T4 timeouts are the first sign, which indicate likely upstream and downstream impairments. Next, you are getting SYNC timing synchronization errors (these are clock messages sent very frequently in the downstream – so there is a downstream issue for sure), finally the modem was unable to lock to the downstream QAM due to failure of finding the SYNC.
You also had TFTP failures. It is unlikely that your modem was given an incorrect TFTP file name unless your IT admin is really asleep at the wheel, but more probable that when the modem sent the TFTP request the message was garbled due to RF impairments (upstream problems).
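If you want to tally a log like this one instead of eyeballing it, a short Python sketch along these lines can help. It assumes each entry is laid out as message, severity, count and two timestamps, as in the log you posted; treating the first timestamp as “last seen” and the second as “first seen” is my guess from the values, and the function names are made up for this example.

```python
import re
from collections import Counter

# Assumed layout: <message> <severity> <count> <timestamp> <timestamp>
LINE = re.compile(
    r"^(?P<message>.+?)\s+(?P<severity>critical|error|warning|notice)\s+"
    r"(?P<count>\d+)\s+"
    r"(?P<last_seen>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+"
    r"(?P<first_seen>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})$"
)


def categorize(message):
    """Put a log message into a coarse bucket."""
    if "T3" in message:
        return "T3"
    if "T4" in message:
        return "T4"
    if message.startswith("SYNC"):
        return "SYNC"
    return "other"


def tally(lines):
    """Sum the event counts per bucket, skipping lines that do not parse."""
    totals = Counter()
    for line in lines:
        match = LINE.match(line.strip())
        if match:
            totals[categorize(match["message"])] += int(match["count"])
    return totals


log = [
    "No Ranging Response received – T3 time-out critical 2 2011-07-11 10:01:09 2011-07-11 10:01:08",
    "SYNC Timing Synchronization failure – Failed to acquire QAMQPSK symbol timing critical 3667 2011-07-11 10:05:15 2011-07-11 09:53:45",
    "SYNC Timing Synchronization failure – Failed to acquire FEC framing critical 4 2011-07-11 10:05:16 2011-07-11 10:03:23",
    "DHCP WARNING – Non-critical field invalid in response. critical 1 2011-07-11 10:05:29 2011-07-11 10:05:29",
]
print(tally(log))
```

Run against the four sample lines, the SYNC failures dominate the totals, which is the same picture the raw log gives: the downstream is where most of the pain is.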
So there are two likely locations of the problems: 1) customer’s home and 2) RF plant. Likely you have moved the CM close to the drop into the customer’s home by now. You really want nothing more than one splitter between the drop and the cable modem. So the rest is the RF plant. The log you sent me indicates that the problems are happening around 10am. Have you sent a tech to the home at that time of day? Is there heavy equipment running at that time? Does the customer do something at the same time they have data problems, such as run an electric clothes dryer which is sitting beside the modem? This should not be a problem of course if all of the RF coax cables are in good shape.
Should you worry about this? That is up to you. Do you have a telco competitor in the same area? If so, how long do you want to keep the customer? I used to have very similar looking cable modem logs. I worked with my cable operator for three years to resolve the problem. I have Series 6 coax throughout my house with compression fit connectors. Today I have U-Verse and data service that is rock solid. I wish I could have gotten the same from my cable modem.
Best of luck,
-Brady
Hi Brady,
When you say T4 timeouts can be associated with very high CMTS utilisation you mean processor utilisation on the CMTS, rather than bandwidth utilisation on the DS or US, correct?
Hi Ken,
Yes, the assumption here is that the CMTS has 200 msec to respond to the RNG-REQ. If the CMTS processor is busy handling other tasks, it may become delayed and not process the RNG-RSP within 200 msec. Let’s say it takes 250 msec (an extra 50 msec). The cable modem is the device that is keeping track of time and therefore will throw the “T4 timeout” error log. This is not an RF problem on the downstream, but an over-utilization issue.
-Brady
Thanks Brady,
What I was getting at is that downstream MAC-layer traffic should always be prioritised over layer-3 traffic, assuming of course the scheduler itself isn’t overburdened. Thanks again.
Ken,
You are absolutely correct. Most respected CMTSs with updated processing platforms and IOSs that you buy today will ensure that the RNG-RSP is prioritized in a timely manner. That being said, I have seen my share of out-dated IOSs and under-scaled processing engines running in large systems. When you look at the stats you see a long history of 99% utilization. It is a challenging discussion to have that some of the challenges in the system may not all be RF related and the fix is going to cost CAPEX money, none of which I take any commission on, BTW.
-Brady
Hi Brady.
Is it possible to block certain MAC addresses from Ranging. I have an experience that a MAC works for few minutes fine but later on it never passes through Ranging status. Is it due to cloning of MAC from different node?
Hi MZee,
There could be a number of reasons that a certain MAC address may range one time, then be rejected and fail to range later. My first recommendation is to check the CMTS flap list and see if the MAC address is on it and what information is listed there. Sometimes the problem may be simple, such as the modem not having enough transmit power, which is an RF plant issue. However, if you have a “cloned” modem problem, then putting the MAC address on a reject list will also impact the paying subscriber. A better method is to use EAE (Early Authentication Encryption) on DOCSIS 3.0 systems in combination with Fraud Prevention modules added to your provisioning system.
-Brady