The Notebook Review forums were hosted by TechTarget, who shut them down on January 31, 2022. This static read-only archive was pulled by NBR forum users between January 20 and January 31, 2022, in an effort to make sure that the valuable technical information posted on the forums is preserved. For current discussions, many NBR forum users moved over to NotebookTalk.net after the shutdown.

    GPUs & IGP/NB desoldering themselves

    Discussion in 'Hardware Components and Aftermarket Upgrades' started by s_rouault, May 29, 2011.

  1. s_rouault

    s_rouault Notebook Enthusiast

    Reputations:
    0
    Messages:
    11
    Likes Received:
    0
    Trophy Points:
    5
    Who knows about this topic? I've heard explanations that range from too much heat causing it, to the combination of lead-free solder and too many thermal cycles being the problem.

    The lead-free solder blamers advise having the chip 'reballed' with leaded solder, while the other group claims simply improving the cooling is the right solution. Perhaps it's some of both?

    If I have to replace the motherboard again I am better off buying another laptop, but is there any way to be sure another laptop will be any better?

    Ripping a laptop apart every so often to heat the igp/nb or gpu seems like a bad solution to me...
     
  2. Tsunade_Hime

    Tsunade_Hime such bacon. wow

    Reputations:
    5,413
    Messages:
    10,711
    Likes Received:
    1,204
    Trophy Points:
    581
    Are you referring to the Nvidia defective chips?

    All G84M/G86M/G72M based Nvidia chips use cheap substrate material, so the constant heating and cooling of the GPU makes it expand and contract until the solder joints fail and the GPU loses contact with the motherboard, resulting in loss of video or artifacts. There is no permanent fix, and I would advise against trying to repair it. Rebaking the motherboard to reflow the solder joints can hold for as little as 2 days or as long as 2+ years. You can reball the GPU all you want, but it will fail eventually down the road. I would invest in a laptop without those cores. AFAIK all Nvidia chips after the G84M/G86M/G72M shouldn't have any further issues.
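
    To put rough numbers on that expand-and-contract mechanism, here is a minimal Python sketch of the classic first-order shear-strain model for a BGA corner ball (gamma ~= delta_alpha * delta_T * DNP / h). Every value below is an assumed, typical textbook figure, not a measurement of these specific packages:

        # First-order estimate of the shear strain a corner BGA ball sees each
        # thermal cycle, driven by the CTE mismatch between package and board.
        # Illustrative sketch only; all numbers are assumptions.

        def joint_shear_strain(cte_package_ppm, cte_board_ppm, delta_t_c,
                               dist_neutral_point_mm, joint_height_mm):
            """gamma ~= delta_alpha * delta_T * DNP / h (first-order model)."""
            delta_alpha = abs(cte_board_ppm - cte_package_ppm) * 1e-6  # 1/degC
            return delta_alpha * delta_t_c * dist_neutral_point_mm / joint_height_mm

        # Assumed values: FR-4 board ~17 ppm/degC, organic package ~12 ppm/degC,
        # 17.5 mm from package centre to a corner ball, 0.4 mm standoff height,
        # ~60 degC idle-to-load temperature swing.
        per_cycle = joint_shear_strain(12, 17, 60, 17.5, 0.4)
        print(f"shear strain per cycle: {per_cycle:.4f}")  # ~0.013

    Accumulate that strain over thousands of on/off cycles and the corner balls crack first, which fits the "works again after a rebake, then dies again" pattern.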
     
  3. s_rouault

    s_rouault Notebook Enthusiast

    Reputations:
    0
    Messages:
    11
    Likes Received:
    0
    Trophy Points:
    5
    I've heard of the Nvidia defect. The flip chip (glassy die part) actually disconnects from the package (the part that connects to the motherboard). But I'm talking strictly about the connection between a chip's package and the motherboard.

    If only the connection between the die and the package is failing, and that is only a specific Nvidia problem, why do AMD chips exhibit the same problems, and why do the same solutions work?

    This happens with consoles too, but for some reason I've never heard of it happening to desktop video cards.
     
  4. jeremyshaw

    jeremyshaw Big time Idiot

    Reputations:
    791
    Messages:
    3,210
    Likes Received:
    231
    Trophy Points:
    131
    It was a move to a different type of solder, prompted by new EU regulations. In the name of simplifying designs (that only goes so far - different solders for the underfill require slightly different die designs), the new solder was pushed in without being fully tested. The most infamous example of this was the Xbox 360, which was fully developed in a 1.5-year span, with final dies only coming out less than 2 months before launch.

    ATi chips from that era are less known to have such problems because:

    at that time, ATi was taking a beating vs nVidia in performance, so their chips were rarely seen in laptops.

    laptops with ATi chips normally used their lower-end variants, which didn't put out as much heat as their larger cousins.
     
  5. s_rouault

    s_rouault Notebook Enthusiast

    Reputations:
    0
    Messages:
    11
    Likes Received:
    0
    Trophy Points:
    5
    I've heard about the solder change being the culprit before, but that doesn't explain why desktop graphics cards are unaffected. And what about CPUs?

    My understanding of the Nvidia suit is that only a small percentage of their chips were affected, because of mismatched materials rather than simply choosing the wrong solder. Supposedly it was an isolated problem attributed only to Nvidia mobile chips.

    Are the arguments about the ball grid array failing a myth, and should failure always be attributed to the die-to-package connection? If that's true, why are there still so many failures?

    To put my frustration more simply: I don't want to buy a laptop that lives just past its one-year warranty, only to weigh buying another one that might fail in another year, putting this one in the oven in hopes that will 'fix' it for another year, or paying some guy who just bought a machine to bake it himself for some unreasonable fee.

    What's further annoying is that no one really knows why. It happens to consoles old and new, it happens to laptops old and new, but it doesn't happen to CPUs or desktop GPUs. Why these specific chips?

    I guess the best solution is to pay the extra money for a really long warranty and forget about it.
     
  6. Peon

    Peon Notebook Virtuoso

    Reputations:
    406
    Messages:
    2,007
    Likes Received:
    128
    Trophy Points:
    81
    If by "small percentage" you mean "every single chip ever manufactured within the Geforce 8M series"...
     
  7. s_rouault

    s_rouault Notebook Enthusiast

    Reputations:
    0
    Messages:
    11
    Likes Received:
    0
    Trophy Points:
    5
    In general, I mean - i.e. out of all the chips Nvidia has produced.
     
  8. Tsunade_Hime

    Tsunade_Hime such bacon. wow

    Reputations:
    5,413
    Messages:
    10,711
    Likes Received:
    1,204
    Trophy Points:
    581
    Unfortunately there really is no way to know for future reference, though I sincerely hope that Nvidia learned their lesson from those defective GeForce 7/8/9 cores. AFAIK there have been no more issues with chips desoldering themselves since, but you may want to invest in a laptop with a discrete card that is removable.
     
  9. jeremyshaw

    jeremyshaw Big time Idiot

    Reputations:
    791
    Messages:
    3,210
    Likes Received:
    231
    Trophy Points:
    131
    It happens a lot to desktop chips. Just ask around. The G80s are falling like flies at times. For the smaller cards, it's simply that they have better cooling systems that mitigate those issues relative to their heat output :p Desktop cards only have to cool the GPU, and the larger cards often have an IHS to keep the die in place, or custom cooling units tailored specifically for the purpose (imagine the GTX 480/GTX 580 - those coolers have to dissipate 300W of heat... all on their own!!). Not to mention, desktops custom-built by enthusiasts often have decent cooling systems/airflow of their own.

    Laptops, on the other hand... even the massive Clevo & Alienware systems have teensy cooling systems that are barely adequate for the purpose - a limitation of laptop sizes, unfortunately. :(
     
  10. KLF

    KLF NBR Super Modernator Super Moderator

    Reputations:
    2,844
    Messages:
    2,736
    Likes Received:
    897
    Trophy Points:
    131
    Desktop-side GF 7300, 7600, 8400, 8600 series break quite often. The Nforce 430 motherboard chipset is from the same bad batch. All of those are close relatives of the laptop variants, and all of them suffer from the same problem. The 8800 series GPUs are newer; I haven't seen/heard as many problems with them.

    Frequent hot-cold-hot-cold changes contribute to a quick death; desktops don't see those as much, since their heatsinks are bigger and heat up/cool down more slowly than laptops'.
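
    That hot-cold argument can be put in rough numbers with the usual Coffin-Manson scaling, where cycles to failure go as (strain range)^(-n) and n ~ 2 is often quoted for solder fatigue. A minimal sketch; the exponent and both temperature swings below are illustrative assumptions:

        # Rough Coffin-Manson-style comparison: cycles to failure scale as
        # (temperature swing)^(-n). The proportionality constant cancels out,
        # so only the ratio of the two swings matters here.

        def relative_life(delta_t_c, n=2.0):
            """Relative cycles-to-failure for a given temperature swing."""
            return delta_t_c ** (-n)

        laptop = relative_life(60.0)   # assumed ~60 degC idle-to-load swing
        desktop = relative_life(30.0)  # assumed ~30 degC swing (bigger heatsink)
        print(f"desktop joints survive ~{desktop / laptop:.0f}x as many cycles")  # ~4x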
     
  11. Tsunade_Hime

    Tsunade_Hime such bacon. wow

    Reputations:
    5,413
    Messages:
    10,711
    Likes Received:
    1,204
    Trophy Points:
    581
    G80/G92 desktop chips are failing in massive numbers now. I don't have to tell you how many 8800GT/8800GTX/8800GTS/9800GT cards I've had to replace in customer units.

    Good thing my Vostro 1500's 8600M GT is on a separate card. :D
     
  12. newsposter

    newsposter Notebook Virtuoso

    Reputations:
    801
    Messages:
    3,881
    Likes Received:
    0
    Trophy Points:
    105
    This whole topic shows you what happens when agenda-driven political animals dictate technical details.
     
  13. zippyzap

    zippyzap Notebook Consultant

    Reputations:
    159
    Messages:
    201
    Likes Received:
    1
    Trophy Points:
    30
    RoHS = Restriction of Hazardous Substances

    Where it relates to what we're talking about: basically, no more lead in solder. This has the unfortunate side effect (at least initially) of making solder joints more brittle. Actually, "cold solder joints" were always a problem in electronics. I remember back in the day, when computer CRT monitors were worth enough to be worth repairing, the majority of failures, especially with cheaper models, were cold solder joints. That was merely from the heating/cooling a monitor goes through in normal use.

    With computer chips (GPUs/chipsets especially) this was compounded by RoHS and the highly localized high temperatures. And yes, it affected more than just NVIDIA, but of course Charlie Demerjian felt obliged to create the whole "NVIDIA bumpgate" affair out of his personal hatred of NVIDIA, who cut him off after he broke NDA.
     
  14. oldstyle

    oldstyle Notebook Consultant

    Reputations:
    36
    Messages:
    100
    Likes Received:
    0
    Trophy Points:
    15
    I read this thread thinking it might be about Optimus. That would have been interesting. I thought that yes the hot/cold cycles could maybe create the issue we all discovered with the 8xxx series.

    Instead I get this?

    You all make good points, but much of this is old - what's the point? If you still have an 8xxx GPU you have bigger problems.

    Was this thread about GPU/IGP/Optimus thermal fluctuations from the start, or no?
     
  15. zippyzap

    zippyzap Notebook Consultant

    Reputations:
    159
    Messages:
    201
    Likes Received:
    1
    Trophy Points:
    30
    LOL, that is true.

    Who ever said anything about Optimus? You're the first to mention it in this thread, so I think you're imagining things.
     
  16. SPL15

    SPL15 Notebook Enthusiast

    Reputations:
    3
    Messages:
    18
    Likes Received:
    1
    Trophy Points:
    6
    The problem with lead-free solder (which I hate) is that it requires much higher temps to flow, and the fluxes used are not as effective at wetting it as they are with leaded solder. The problem first came about when many, many electronics manufacturing houses didn't adjust temps or fluxes and just threw lead-free solder into their PCB assembly machines. This led to cold solder joints, where the solder didn't have the flow characteristics or the temperature to etch into the copper traces, so the bond was merely mechanical adhesion instead of a molecular bond.
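
    To illustrate the point above: eutectic Sn63/Pb37 melts at 183 degC, while common lead-free SAC305 doesn't go fully liquid until about 217 degC, so a reflow profile tuned for the old alloy never gets far enough above the new alloy's melting point to wet properly. A minimal sketch; the peak profile temperature and the 15 degC wetting margin are illustrative assumptions:

        # Why "just swap in lead-free solder" produces cold joints: a reflow
        # profile from the Sn-Pb era peaks at or below SAC305's liquidus.
        # Liquidus temps are standard figures; margin and peak are assumptions.

        LIQUIDUS_C = {"Sn63Pb37": 183, "SAC305": 217}
        WETTING_MARGIN_C = 15  # assumed minimum overshoot for reliable wetting

        def cold_joint_risk(alloy, peak_profile_c):
            margin = peak_profile_c - LIQUIDUS_C[alloy]
            return margin < WETTING_MARGIN_C, margin

        for alloy in ("Sn63Pb37", "SAC305"):
            risky, margin = cold_joint_risk(alloy, peak_profile_c=215)
            print(f"{alloy}: margin {margin:+d} degC ->",
                  "COLD JOINT RISK" if risky else "ok")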

    A cold solder joint is one where there was insufficient heat & wetting action to melt a thin layer of the copper trace/lead and create a super-thin molecular layer of copper/solder alloy at the interface of the joint. Over time and heat cycling, a cold solder joint just breaks free, because the bond was purely mechanical and the two metals were never molecularly joined. Lead-free solder is also harder than leaded solder, so even when things are molecularly joined there is less flex and give, and the connection can shear in fewer cycles, especially across dissimilar expansion coefficients.

    In OLD electronics with failed leaded solder joints that were once good, heat expansion and contraction from excessive current, or improper heatsinking through the joint, causes the crystalline structure of the solder to gradually break down. That raises the joint's resistance, so more heat is dissipated through it; the joint turns brown from heat & oxidation and eventually shears away from the component lead.

    The funny thing is that the joint never fails on the PCB trace side of the connection (unless the solder pad rips away from the connecting trace). The failure of the actual solder joint is always on the component lead side, because the expansion coefficient is much lower in the component lead and the bonding surface area is also smaller. Also, the PCB trace copper melts more and creates a better molecular bond to the solder than the component lead does; the lead is heatsinked through the component and stays cooler, so its copper/solder alloy layer is thinner and more abrupt in its boundaries. A thicker copper/solder alloy boundary layer kind of "bridges" the two different expansion coefficients of the solder and whatever it is attached to: instead of a hard boundary between differing expansion coefficients, there is a slightly more gradual transition.

    I can't tell you how many times I've had to deal with lead-free solder BS in my work since 2006. With BGA, if the manufacturer doesn't x-ray each part to make sure there is a good connection, there are gonna be LOTS of failures unless the line is kept in 110% perfect working order and tolerance. I've been to China many times, and holding close manufacturing tolerances on the line with respect to temps is not happening unless you sit there and babysit them.