NGSAppserver7 queue is full::WO0000000053095
MINOR Closed TTB-Outages
STATUS
Closed
CREATED
Sep 07, 10:50 AM (4¾ years ago)
TYPE
TalkTalk Outage
STARTED
Sep 07, 09:42 AM (4¾ years ago)
CLOSED
Sep 10, 01:10 PM (4¾ years ago)
REFERENCE
37188 / INC13478037
INFORMATION
  • INITIAL
    4¾ years ago

    Summary
    It has been reported that the NGSAppserver7 request queue has filled up. This has impacted all OSS systems, as they can no longer send logging messages to NGSAppserver7. It will also have affected resellers' ability to send us commands.
    N/A

  • UPDATE
    4¾ years ago

    Latest Update Investigation into the cause is ongoing with Business Applications & CGI OSS Support. As the backlog reduces on NGSAppServer7, the client server will be able to pass more commands over, and the client server queues will gradually return to BAU levels. It has been agreed that OSS will continue to monitor the queues and provide stat updates every hour until it has been confirmed that the queues are reducing as expected. Once the backlog has been cleared, clean-up activities will need to be completed by the BAS/OSS Support teams, who will go through all order tracking and work with resellers to resubmit any commands if required.
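
    The hourly stat updates described above amount to comparing each queue's depth against an expected BAU level. A minimal sketch of that check is below; the queue names, the BAU threshold, and the snapshot data are illustrative assumptions, not the actual OSS/Tibco tooling.

```python
# Hypothetical sketch of the hourly queue-depth stat check described above.
# Queue names, the threshold, and the snapshot values are assumptions for
# illustration only; they are not the real NGSAppServer7 monitoring setup.

BAU_THRESHOLD = 1000  # assumed "business as usual" depth per queue


def report_queue_status(depths, threshold=BAU_THRESHOLD):
    """Return (queue, depth, status) tuples for a stat update."""
    report = []
    for queue, depth in sorted(depths.items()):
        status = "BAU" if depth <= threshold else "BACKLOG"
        report.append((queue, depth, status))
    return report


# Example stat snapshot (invented numbers) for the client-server queues.
snapshot = {"ngs7.logging": 350000, "ngs7.commands": 800}
for queue, depth, status in report_queue_status(snapshot):
    print(f"{queue}: {depth} ({status})")
```

    Repeating such a snapshot hourly, as the update describes, is enough to confirm the backlog is trending down before the clean-up work starts.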

  • UPDATE
    4¾ years ago

    Latest Update As the backlog reduces on NGSAppServer7, the client server will be able to pass more commands over, and the client server queues will gradually return to BAU levels. It has been agreed that OSS will continue to monitor the queues and provide stat updates every hour until it has been confirmed that the queues are reducing as expected. Once the backlog has been cleared, clean-up activities can then be completed by the BAS/OSS Support teams, who will go through the order tracking data and identify and progress any impacted/duplicated orders.

  • UPDATE
    4¾ years ago

    Latest Update As the backlog reduces on NGSAppServer7, the client server will be able to pass more commands over, and the client server queues will gradually return to BAU levels. It has been agreed that OSS will continue to monitor the queues and provide stat updates every hour until it has been confirmed that the queues are reducing as expected. Once the backlog has been cleared, clean-up activities can then be completed by the BAS/OSS Support teams, who will go through the order tracking data and identify and progress any impacted/duplicated orders.

  • UPDATE
    4¾ years ago

    Latest Update As the backlog reduces on NGSAppServer7, the client server will be able to pass more commands over, and the client server queues will gradually return to BAU levels. It has been agreed that OSS will continue to monitor the queues and provide stat updates every 4 hours going into the evening and night, until it has been confirmed that the queues are reducing as expected. Once the backlog has been cleared, clean-up activities can then be completed by the BAS/OSS Support teams, who will go through the order tracking data and identify and progress any impacted/duplicated orders.

  • UPDATE
    4¾ years ago

    Latest Update A call has been completed and a plan for the clean-up activities confirmed between the OSS/BAS and Tibco teams. The clean-up was due to start while on the call: although the queues are not yet at BAU levels, it was believed they had reached a point where these tasks could begin. However, while on the call, resource alerts were received for the LLU message broker on NGSAppServer7. Tibco teams have also paused LLU services, which will help the queues reduce; the affected commands will be queued until the service is switched back on. It has been decided to continue monitoring until 11:00hrs to allow the queues to clear down, and a further call will be convened at that time to confirm the current status and whether the required clean-up activities for all impacted orders can start.

  • UPDATE
    4¾ years ago

    Latest Update After the checkpoint call it has been confirmed by the BAS/OSS support teams that, due to the daily volumes of BAU traffic, the combined queues on NGSAppServer7 are still quite high. The volumes are reducing; however, because of the daily volume, some of the backlogged commands are still 24hrs behind. The last stuck/errored command was observed at 16:30hrs yesterday, and as no others have been identified, the clean-up activities for impacted orders have now started.
    1) OSS have used the order tracking tool to pull a list of the impacted orders and the duplicate commands, to identify the ones that are in accepted state. (Completed)
    2) These will then be passed onto TRIO Ops to update the accounts with the correct command ID, which TRIO will track and update.
    3) Once completed, OSS will regenerate a new AOD which will update the correct order in TRIO. At the same time OSS will cancel all the unrequired/duplicate orders.
    A further checkpoint call has been arranged for 14:00hrs to confirm the progress and any remaining steps required.

  • UPDATE
    4¾ years ago

    Latest Update After the checkpoint call it has been confirmed by the BAS/OSS support teams that all the backlog has now cleared from the PortalWeb server above, which is now at BAU. There are currently approx. 350k messages being processed across the NGSAppServer7 queues, which are expected to be completed and back to BAU after 6pm this evening. Clean-up activities for impacted orders:
    1) OSS have used the order tracking tool to pull a list of the impacted orders and the duplicate commands, to identify the ones that are in accepted state. (Completed)
    2) These will then be passed onto TRIO Ops to update the accounts with the correct command ID, which TRIO will track and update. (Completed)
    3) Once completed, OSS will regenerate a new AOD which will update the correct order in TRIO. At the same time OSS will cancel all the unrequired/duplicate orders. (To be completed)
    The BAS support team are currently working on changes to the tool used to generate the new AOD updates for the impacted orders. Once this has been confirmed as completed, the final step above can be progressed. It has, however, been advised that the tool in question cannot be used after 17:00hrs.

  • UPDATE
    4¾ years ago

    Latest Update The backlog of commands (new order requests, cancels, amends, etc.) from partners and Consumer has now been processed and returned to BAU levels since 16:30hrs. Some clean-up activities for impacted/duplicated orders are ongoing between the OSS/BAS and TRIO support teams. Clean-up activities for impacted orders:
    1) OSS have used the order tracking tool to pull a list of the impacted orders and the duplicate commands, to identify the ones that are in accepted state. (Completed)
    2) These will then be passed onto TRIO Ops to update the accounts with the correct command ID, which TRIO will track and update. (Completed)
    3) Once completed, OSS will regenerate a new AOD which will update the correct order in TRIO. (Completed & being validated)
    4) OSS will cancel all the unrequired/duplicate orders. (In progress)
    A further checkpoint call has been arranged for 09:00hrs tomorrow to discuss the progress of the remaining clean-up activities with the Operations and Fulfilment teams, and to ensure no further outstanding issues are identified.
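
    The core of the clean-up described above is identifying, per impacted order, which duplicate command is in the accepted state and flagging the rest for cancellation. A minimal sketch of that grouping logic, under stated assumptions, is below; the record fields (`order_id`, `command_id`, `state`) and the sample data are hypothetical, not the real order tracking schema.

```python
# Hypothetical sketch of the duplicate-command clean-up: group tracked
# commands by order, keep the one in "accepted" state, and flag the rest
# for cancellation. Field names and sample data are assumptions only.

from collections import defaultdict


def split_duplicates(commands):
    """Return ({order_id: kept_command_id}, [duplicate_command_ids])."""
    by_order = defaultdict(list)
    for cmd in commands:
        by_order[cmd["order_id"]].append(cmd)
    keep, cancel = {}, []
    for order_id, cmds in by_order.items():
        accepted = [c for c in cmds if c["state"] == "accepted"]
        kept = accepted[0] if accepted else cmds[0]  # fall back to first seen
        keep[order_id] = kept["command_id"]
        cancel.extend(c["command_id"] for c in cmds if c is not kept)
    return keep, cancel


# Invented example: one order with an accepted command and an errored duplicate.
tracking = [
    {"order_id": "ORD1", "command_id": "C1", "state": "accepted"},
    {"order_id": "ORD1", "command_id": "C2", "state": "errored"},
]
keep, cancel = split_duplicates(tracking)
print(keep, cancel)  # {'ORD1': 'C1'} ['C2']
```

    The kept command IDs would correspond to step 2 above (passed to TRIO Ops), and the flagged duplicates to step 4 (cancelled by OSS).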

  • UPDATE
    4¾ years ago

    Latest Update The backlog across all associated servers has reduced to BAU levels since 16:30hrs on 08/09. Clean-up activities for impacted orders:
    1) OSS have used the order tracking tool to pull a list of the impacted orders and the duplicate commands, to identify the ones that are in accepted state. (Completed)
    2) These will then be passed onto TRIO Ops to update the accounts with the correct command ID, which TRIO will track and update. (Completed)
    3) Once completed, OSS will regenerate a new AOD which will update the correct order in TRIO. (Completed)
    4) OSS are rejecting/cancelling all the unrequired/duplicate orders. (In progress)
    A further checkpoint call has been arranged for 10:10hrs to discuss the progress of the remaining clean-up activities with the Operations and Fulfilment teams, and to ensure no further outstanding issues are identified.

  • UPDATE
    4¾ years ago

    Latest Update Clean-up activities for impacted orders: the TRIO/OSS and Collections teams are analysing the remaining Bar/Un-Bar orders to confirm how these can be progressed. All other impacted orders are being progressed per BAU processes with the Fulfilment teams.

  • UPDATE
    4¾ years ago

    Latest Update Fulfilment teams are currently removing the required Bar commands from OSS. On completion, the Tibco teams will complete the same task from a TRIO perspective. A checkpoint call has been arranged for 10:30hrs to discuss progress. This is the final remaining task; all other impacted orders have been progressed per BAU processes with the Fulfilment teams.

  • RESOLUTION
    4¾ years ago

    Technical / Suspected Root Cause This incident was caused by resource issues with the message broker app on NGSAppServer7. The Logging Destroyer application was started at 09:30hrs on 07/09 by the BAS teams to assist the message broker in processing the backlog of commands (new order requests, cancels, amends, etc.). The queues were reduced to a level where the clean-up activities for any impacted orders were able to start on 08/09 at approximately 11:00hrs. Work was completed between the OSS, Tibco, CRM, Fulfilment and Collections teams to progress all impacted orders into their required states. The incident will now be resolved, as all systems are available and at BAU, with no further issues reported or identified.

  • Closed