aa8901e9bb
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/193497 From-SVN: r275473
2257 lines
73 KiB
Go
2257 lines
73 KiB
Go
// Copyright 2009 The Go Authors. All rights reserved.
|
|
// Use of this source code is governed by a BSD-style
|
|
// license that can be found in the LICENSE file.
|
|
|
|
// Garbage collector (GC).
|
|
//
|
|
// The GC runs concurrently with mutator threads, is type accurate (aka precise), allows multiple
|
|
// GC thread to run in parallel. It is a concurrent mark and sweep that uses a write barrier. It is
|
|
// non-generational and non-compacting. Allocation is done using size segregated per P allocation
|
|
// areas to minimize fragmentation while eliminating locks in the common case.
|
|
//
|
|
// The algorithm decomposes into several steps.
|
|
// This is a high level description of the algorithm being used. For an overview of GC a good
|
|
// place to start is Richard Jones' gchandbook.org.
|
|
//
|
|
// The algorithm's intellectual heritage includes Dijkstra's on-the-fly algorithm, see
|
|
// Edsger W. Dijkstra, Leslie Lamport, A. J. Martin, C. S. Scholten, and E. F. M. Steffens. 1978.
|
|
// On-the-fly garbage collection: an exercise in cooperation. Commun. ACM 21, 11 (November 1978),
|
|
// 966-975.
|
|
// For journal quality proofs that these steps are complete, correct, and terminate see
|
|
// Hudson, R., and Moss, J.E.B. Copying Garbage Collection without stopping the world.
|
|
// Concurrency and Computation: Practice and Experience 15(3-5), 2003.
|
|
//
|
|
// 1. GC performs sweep termination.
|
|
//
|
|
// a. Stop the world. This causes all Ps to reach a GC safe-point.
|
|
//
|
|
// b. Sweep any unswept spans. There will only be unswept spans if
|
|
// this GC cycle was forced before the expected time.
|
|
//
|
|
// 2. GC performs the mark phase.
|
|
//
|
|
// a. Prepare for the mark phase by setting gcphase to _GCmark
|
|
// (from _GCoff), enabling the write barrier, enabling mutator
|
|
// assists, and enqueueing root mark jobs. No objects may be
|
|
// scanned until all Ps have enabled the write barrier, which is
|
|
// accomplished using STW.
|
|
//
|
|
// b. Start the world. From this point, GC work is done by mark
|
|
// workers started by the scheduler and by assists performed as
|
|
// part of allocation. The write barrier shades both the
|
|
// overwritten pointer and the new pointer value for any pointer
|
|
// writes (see mbarrier.go for details). Newly allocated objects
|
|
// are immediately marked black.
|
|
//
|
|
// c. GC performs root marking jobs. This includes scanning all
|
|
// stacks, shading all globals, and shading any heap pointers in
|
|
// off-heap runtime data structures. Scanning a stack stops a
|
|
// goroutine, shades any pointers found on its stack, and then
|
|
// resumes the goroutine.
|
|
//
|
|
// d. GC drains the work queue of grey objects, scanning each grey
|
|
// object to black and shading all pointers found in the object
|
|
// (which in turn may add those pointers to the work queue).
|
|
//
|
|
// e. Because GC work is spread across local caches, GC uses a
|
|
// distributed termination algorithm to detect when there are no
|
|
// more root marking jobs or grey objects (see gcMarkDone). At this
|
|
// point, GC transitions to mark termination.
|
|
//
|
|
// 3. GC performs mark termination.
|
|
//
|
|
// a. Stop the world.
|
|
//
|
|
// b. Set gcphase to _GCmarktermination, and disable workers and
|
|
// assists.
|
|
//
|
|
// c. Perform housekeeping like flushing mcaches.
|
|
//
|
|
// 4. GC performs the sweep phase.
|
|
//
|
|
// a. Prepare for the sweep phase by setting gcphase to _GCoff,
|
|
// setting up sweep state and disabling the write barrier.
|
|
//
|
|
// b. Start the world. From this point on, newly allocated objects
|
|
// are white, and allocating sweeps spans before use if necessary.
|
|
//
|
|
// c. GC does concurrent sweeping in the background and in response
|
|
// to allocation. See description below.
|
|
//
|
|
// 5. When sufficient allocation has taken place, replay the sequence
|
|
// starting with 1 above. See discussion of GC rate below.
|
|
|
|
// Concurrent sweep.
|
|
//
|
|
// The sweep phase proceeds concurrently with normal program execution.
|
|
// The heap is swept span-by-span both lazily (when a goroutine needs another span)
|
|
// and concurrently in a background goroutine (this helps programs that are not CPU bound).
|
|
// At the end of STW mark termination all spans are marked as "needs sweeping".
|
|
//
|
|
// The background sweeper goroutine simply sweeps spans one-by-one.
|
|
//
|
|
// To avoid requesting more OS memory while there are unswept spans, when a
|
|
// goroutine needs another span, it first attempts to reclaim that much memory
|
|
// by sweeping. When a goroutine needs to allocate a new small-object span, it
|
|
// sweeps small-object spans for the same object size until it frees at least
|
|
// one object. When a goroutine needs to allocate large-object span from heap,
|
|
// it sweeps spans until it frees at least that many pages into heap. There is
|
|
// one case where this may not suffice: if a goroutine sweeps and frees two
|
|
// nonadjacent one-page spans to the heap, it will allocate a new two-page
|
|
// span, but there can still be other one-page unswept spans which could be
|
|
// combined into a two-page span.
|
|
//
|
|
// It's critical to ensure that no operations proceed on unswept spans (that would corrupt
|
|
// mark bits in GC bitmap). During GC all mcaches are flushed into the central cache,
|
|
// so they are empty. When a goroutine grabs a new span into mcache, it sweeps it.
|
|
// When a goroutine explicitly frees an object or sets a finalizer, it ensures that
|
|
// the span is swept (either by sweeping it, or by waiting for the concurrent sweep to finish).
|
|
// The finalizer goroutine is kicked off only when all spans are swept.
|
|
// When the next GC starts, it sweeps all not-yet-swept spans (if any).
|
|
|
|
// GC rate.
|
|
// Next GC is after we've allocated an extra amount of memory proportional to
|
|
// the amount already in use. The proportion is controlled by GOGC environment variable
|
|
// (100 by default). If GOGC=100 and we're using 4M, we'll GC again when we get to 8M
|
|
// (this mark is tracked in next_gc variable). This keeps the GC cost in linear
|
|
// proportion to the allocation cost. Adjusting GOGC just changes the linear constant
|
|
// (and also the amount of extra memory used).
|
|
|
|
// Oblets
|
|
//
|
|
// In order to prevent long pauses while scanning large objects and to
|
|
// improve parallelism, the garbage collector breaks up scan jobs for
|
|
// objects larger than maxObletBytes into "oblets" of at most
|
|
// maxObletBytes. When scanning encounters the beginning of a large
|
|
// object, it scans only the first oblet and enqueues the remaining
|
|
// oblets as new scan jobs.
|
|
|
|
package runtime
|
|
|
|
import (
|
|
"internal/cpu"
|
|
"runtime/internal/atomic"
|
|
"unsafe"
|
|
)
|
|
|
|
const (
|
|
_DebugGC = 0
|
|
_ConcurrentSweep = true
|
|
_FinBlockSize = 4 * 1024
|
|
|
|
// sweepMinHeapDistance is a lower bound on the heap distance
|
|
// (in bytes) reserved for concurrent sweeping between GC
|
|
// cycles.
|
|
sweepMinHeapDistance = 1024 * 1024
|
|
)
|
|
|
|
// heapminimum is the minimum heap size at which to trigger GC.
|
|
// For small heaps, this overrides the usual GOGC*live set rule.
|
|
//
|
|
// When there is a very small live set but a lot of allocation, simply
|
|
// collecting when the heap reaches GOGC*live results in many GC
|
|
// cycles and high total per-GC overhead. This minimum amortizes this
|
|
// per-GC overhead while keeping the heap reasonably small.
|
|
//
|
|
// During initialization this is set to 4MB*GOGC/100. In the case of
|
|
// GOGC==0, this will set heapminimum to 0, resulting in constant
|
|
// collection even when the heap size is small, which is useful for
|
|
// debugging.
|
|
var heapminimum uint64 = defaultHeapMinimum
|
|
|
|
// defaultHeapMinimum is the value of heapminimum for GOGC==100.
|
|
const defaultHeapMinimum = 4 << 20
|
|
|
|
// Initialized from $GOGC. GOGC=off means no GC.
|
|
var gcpercent int32
|
|
|
|
func gcinit() {
|
|
if unsafe.Sizeof(workbuf{}) != _WorkbufSize {
|
|
throw("size of Workbuf is suboptimal")
|
|
}
|
|
|
|
// No sweep on the first cycle.
|
|
mheap_.sweepdone = 1
|
|
|
|
// Set a reasonable initial GC trigger.
|
|
memstats.triggerRatio = 7 / 8.0
|
|
|
|
// Fake a heap_marked value so it looks like a trigger at
|
|
// heapminimum is the appropriate growth from heap_marked.
|
|
// This will go into computing the initial GC goal.
|
|
memstats.heap_marked = uint64(float64(heapminimum) / (1 + memstats.triggerRatio))
|
|
|
|
// Set gcpercent from the environment. This will also compute
|
|
// and set the GC trigger and goal.
|
|
_ = setGCPercent(readgogc())
|
|
|
|
work.startSema = 1
|
|
work.markDoneSema = 1
|
|
}
|
|
|
|
func readgogc() int32 {
|
|
p := gogetenv("GOGC")
|
|
if p == "off" {
|
|
return -1
|
|
}
|
|
if n, ok := atoi32(p); ok {
|
|
return n
|
|
}
|
|
return 100
|
|
}
|
|
|
|
// gcenable is called after the bulk of the runtime initialization,
|
|
// just before we're about to start letting user code run.
|
|
// It kicks off the background sweeper goroutine, the background
|
|
// scavenger goroutine, and enables GC.
|
|
func gcenable() {
|
|
// Kick off sweeping and scavenging.
|
|
c := make(chan int, 2)
|
|
expectSystemGoroutine()
|
|
go bgsweep(c)
|
|
expectSystemGoroutine()
|
|
go bgscavenge(c)
|
|
<-c
|
|
<-c
|
|
memstats.enablegc = true // now that runtime is initialized, GC is okay
|
|
}
|
|
|
|
//go:linkname setGCPercent runtime..z2fdebug.setGCPercent
|
|
func setGCPercent(in int32) (out int32) {
|
|
// Run on the system stack since we grab the heap lock.
|
|
systemstack(func() {
|
|
lock(&mheap_.lock)
|
|
out = gcpercent
|
|
if in < 0 {
|
|
in = -1
|
|
}
|
|
gcpercent = in
|
|
heapminimum = defaultHeapMinimum * uint64(gcpercent) / 100
|
|
// Update pacing in response to gcpercent change.
|
|
gcSetTriggerRatio(memstats.triggerRatio)
|
|
unlock(&mheap_.lock)
|
|
})
|
|
|
|
// If we just disabled GC, wait for any concurrent GC mark to
|
|
// finish so we always return with no GC running.
|
|
if in < 0 {
|
|
gcWaitOnMark(atomic.Load(&work.cycles))
|
|
}
|
|
|
|
return out
|
|
}
|
|
|
|
// Garbage collector phase.
|
|
// Indicates to write barrier and synchronization task to perform.
|
|
var gcphase uint32
|
|
|
|
// The compiler knows about this variable.
|
|
// If you change it, you must change gofrontend/wb.cc, too.
|
|
// If you change the first four bytes, you must also change the write
|
|
// barrier insertion code.
|
|
var writeBarrier struct {
|
|
enabled bool // compiler emits a check of this before calling write barrier
|
|
pad [3]byte // compiler uses 32-bit load for "enabled" field
|
|
needed bool // whether we need a write barrier for current GC phase
|
|
cgo bool // whether we need a write barrier for a cgo check
|
|
alignme uint64 // guarantee alignment so that compiler can use a 32 or 64-bit load
|
|
}
|
|
|
|
// gcBlackenEnabled is 1 if mutator assists and background mark
|
|
// workers are allowed to blacken objects. This must only be set when
|
|
// gcphase == _GCmark.
|
|
var gcBlackenEnabled uint32
|
|
|
|
const (
|
|
_GCoff = iota // GC not running; sweeping in background, write barrier disabled
|
|
_GCmark // GC marking roots and workbufs: allocate black, write barrier ENABLED
|
|
_GCmarktermination // GC mark termination: allocate black, P's help GC, write barrier ENABLED
|
|
)
|
|
|
|
//go:nosplit
|
|
func setGCPhase(x uint32) {
|
|
atomic.Store(&gcphase, x)
|
|
writeBarrier.needed = gcphase == _GCmark || gcphase == _GCmarktermination
|
|
writeBarrier.enabled = writeBarrier.needed || writeBarrier.cgo
|
|
}
|
|
|
|
// gcMarkWorkerMode represents the mode that a concurrent mark worker
|
|
// should operate in.
|
|
//
|
|
// Concurrent marking happens through four different mechanisms. One
|
|
// is mutator assists, which happen in response to allocations and are
|
|
// not scheduled. The other three are variations in the per-P mark
|
|
// workers and are distinguished by gcMarkWorkerMode.
|
|
type gcMarkWorkerMode int
|
|
|
|
const (
|
|
// gcMarkWorkerDedicatedMode indicates that the P of a mark
|
|
// worker is dedicated to running that mark worker. The mark
|
|
// worker should run without preemption.
|
|
gcMarkWorkerDedicatedMode gcMarkWorkerMode = iota
|
|
|
|
// gcMarkWorkerFractionalMode indicates that a P is currently
|
|
// running the "fractional" mark worker. The fractional worker
|
|
// is necessary when GOMAXPROCS*gcBackgroundUtilization is not
|
|
// an integer. The fractional worker should run until it is
|
|
// preempted and will be scheduled to pick up the fractional
|
|
// part of GOMAXPROCS*gcBackgroundUtilization.
|
|
gcMarkWorkerFractionalMode
|
|
|
|
// gcMarkWorkerIdleMode indicates that a P is running the mark
|
|
// worker because it has nothing else to do. The idle worker
|
|
// should run until it is preempted and account its time
|
|
// against gcController.idleMarkTime.
|
|
gcMarkWorkerIdleMode
|
|
)
|
|
|
|
// gcMarkWorkerModeStrings are the strings labels of gcMarkWorkerModes
|
|
// to use in execution traces.
|
|
var gcMarkWorkerModeStrings = [...]string{
|
|
"GC (dedicated)",
|
|
"GC (fractional)",
|
|
"GC (idle)",
|
|
}
|
|
|
|
// gcController implements the GC pacing controller that determines
|
|
// when to trigger concurrent garbage collection and how much marking
|
|
// work to do in mutator assists and background marking.
|
|
//
|
|
// It uses a feedback control algorithm to adjust the memstats.gc_trigger
|
|
// trigger based on the heap growth and GC CPU utilization each cycle.
|
|
// This algorithm optimizes for heap growth to match GOGC and for CPU
|
|
// utilization between assist and background marking to be 25% of
|
|
// GOMAXPROCS. The high-level design of this algorithm is documented
|
|
// at https://golang.org/s/go15gcpacing.
|
|
//
|
|
// All fields of gcController are used only during a single mark
|
|
// cycle.
|
|
var gcController gcControllerState
|
|
|
|
type gcControllerState struct {
|
|
// scanWork is the total scan work performed this cycle. This
|
|
// is updated atomically during the cycle. Updates occur in
|
|
// bounded batches, since it is both written and read
|
|
// throughout the cycle. At the end of the cycle, this is how
|
|
// much of the retained heap is scannable.
|
|
//
|
|
// Currently this is the bytes of heap scanned. For most uses,
|
|
// this is an opaque unit of work, but for estimation the
|
|
// definition is important.
|
|
scanWork int64
|
|
|
|
// bgScanCredit is the scan work credit accumulated by the
|
|
// concurrent background scan. This credit is accumulated by
|
|
// the background scan and stolen by mutator assists. This is
|
|
// updated atomically. Updates occur in bounded batches, since
|
|
// it is both written and read throughout the cycle.
|
|
bgScanCredit int64
|
|
|
|
// assistTime is the nanoseconds spent in mutator assists
|
|
// during this cycle. This is updated atomically. Updates
|
|
// occur in bounded batches, since it is both written and read
|
|
// throughout the cycle.
|
|
assistTime int64
|
|
|
|
// dedicatedMarkTime is the nanoseconds spent in dedicated
|
|
// mark workers during this cycle. This is updated atomically
|
|
// at the end of the concurrent mark phase.
|
|
dedicatedMarkTime int64
|
|
|
|
// fractionalMarkTime is the nanoseconds spent in the
|
|
// fractional mark worker during this cycle. This is updated
|
|
// atomically throughout the cycle and will be up-to-date if
|
|
// the fractional mark worker is not currently running.
|
|
fractionalMarkTime int64
|
|
|
|
// idleMarkTime is the nanoseconds spent in idle marking
|
|
// during this cycle. This is updated atomically throughout
|
|
// the cycle.
|
|
idleMarkTime int64
|
|
|
|
// markStartTime is the absolute start time in nanoseconds
|
|
// that assists and background mark workers started.
|
|
markStartTime int64
|
|
|
|
// dedicatedMarkWorkersNeeded is the number of dedicated mark
|
|
// workers that need to be started. This is computed at the
|
|
// beginning of each cycle and decremented atomically as
|
|
// dedicated mark workers get started.
|
|
dedicatedMarkWorkersNeeded int64
|
|
|
|
// assistWorkPerByte is the ratio of scan work to allocated
|
|
// bytes that should be performed by mutator assists. This is
|
|
// computed at the beginning of each cycle and updated every
|
|
// time heap_scan is updated.
|
|
assistWorkPerByte float64
|
|
|
|
// assistBytesPerWork is 1/assistWorkPerByte.
|
|
assistBytesPerWork float64
|
|
|
|
// fractionalUtilizationGoal is the fraction of wall clock
|
|
// time that should be spent in the fractional mark worker on
|
|
// each P that isn't running a dedicated worker.
|
|
//
|
|
// For example, if the utilization goal is 25% and there are
|
|
// no dedicated workers, this will be 0.25. If the goal is
|
|
// 25%, there is one dedicated worker, and GOMAXPROCS is 5,
|
|
// this will be 0.05 to make up the missing 5%.
|
|
//
|
|
// If this is zero, no fractional workers are needed.
|
|
fractionalUtilizationGoal float64
|
|
|
|
_ cpu.CacheLinePad
|
|
}
|
|
|
|
// startCycle resets the GC controller's state and computes estimates
|
|
// for a new GC cycle. The caller must hold worldsema.
|
|
func (c *gcControllerState) startCycle() {
|
|
c.scanWork = 0
|
|
c.bgScanCredit = 0
|
|
c.assistTime = 0
|
|
c.dedicatedMarkTime = 0
|
|
c.fractionalMarkTime = 0
|
|
c.idleMarkTime = 0
|
|
|
|
// Ensure that the heap goal is at least a little larger than
|
|
// the current live heap size. This may not be the case if GC
|
|
// start is delayed or if the allocation that pushed heap_live
|
|
// over gc_trigger is large or if the trigger is really close to
|
|
// GOGC. Assist is proportional to this distance, so enforce a
|
|
// minimum distance, even if it means going over the GOGC goal
|
|
// by a tiny bit.
|
|
if memstats.next_gc < memstats.heap_live+1024*1024 {
|
|
memstats.next_gc = memstats.heap_live + 1024*1024
|
|
}
|
|
|
|
// Compute the background mark utilization goal. In general,
|
|
// this may not come out exactly. We round the number of
|
|
// dedicated workers so that the utilization is closest to
|
|
// 25%. For small GOMAXPROCS, this would introduce too much
|
|
// error, so we add fractional workers in that case.
|
|
totalUtilizationGoal := float64(gomaxprocs) * gcBackgroundUtilization
|
|
c.dedicatedMarkWorkersNeeded = int64(totalUtilizationGoal + 0.5)
|
|
utilError := float64(c.dedicatedMarkWorkersNeeded)/totalUtilizationGoal - 1
|
|
const maxUtilError = 0.3
|
|
if utilError < -maxUtilError || utilError > maxUtilError {
|
|
// Rounding put us more than 30% off our goal. With
|
|
// gcBackgroundUtilization of 25%, this happens for
|
|
// GOMAXPROCS<=3 or GOMAXPROCS=6. Enable fractional
|
|
// workers to compensate.
|
|
if float64(c.dedicatedMarkWorkersNeeded) > totalUtilizationGoal {
|
|
// Too many dedicated workers.
|
|
c.dedicatedMarkWorkersNeeded--
|
|
}
|
|
c.fractionalUtilizationGoal = (totalUtilizationGoal - float64(c.dedicatedMarkWorkersNeeded)) / float64(gomaxprocs)
|
|
} else {
|
|
c.fractionalUtilizationGoal = 0
|
|
}
|
|
|
|
// In STW mode, we just want dedicated workers.
|
|
if debug.gcstoptheworld > 0 {
|
|
c.dedicatedMarkWorkersNeeded = int64(gomaxprocs)
|
|
c.fractionalUtilizationGoal = 0
|
|
}
|
|
|
|
// Clear per-P state
|
|
for _, p := range allp {
|
|
p.gcAssistTime = 0
|
|
p.gcFractionalMarkTime = 0
|
|
}
|
|
|
|
// Compute initial values for controls that are updated
|
|
// throughout the cycle.
|
|
c.revise()
|
|
|
|
if debug.gcpacertrace > 0 {
|
|
print("pacer: assist ratio=", c.assistWorkPerByte,
|
|
" (scan ", memstats.heap_scan>>20, " MB in ",
|
|
work.initialHeapLive>>20, "->",
|
|
memstats.next_gc>>20, " MB)",
|
|
" workers=", c.dedicatedMarkWorkersNeeded,
|
|
"+", c.fractionalUtilizationGoal, "\n")
|
|
}
|
|
}
|
|
|
|
// revise updates the assist ratio during the GC cycle to account for
|
|
// improved estimates. This should be called either under STW or
|
|
// whenever memstats.heap_scan, memstats.heap_live, or
|
|
// memstats.next_gc is updated (with mheap_.lock held).
|
|
//
|
|
// It should only be called when gcBlackenEnabled != 0 (because this
|
|
// is when assists are enabled and the necessary statistics are
|
|
// available).
|
|
func (c *gcControllerState) revise() {
|
|
gcpercent := gcpercent
|
|
if gcpercent < 0 {
|
|
// If GC is disabled but we're running a forced GC,
|
|
// act like GOGC is huge for the below calculations.
|
|
gcpercent = 100000
|
|
}
|
|
live := atomic.Load64(&memstats.heap_live)
|
|
|
|
var heapGoal, scanWorkExpected int64
|
|
if live <= memstats.next_gc {
|
|
// We're under the soft goal. Pace GC to complete at
|
|
// next_gc assuming the heap is in steady-state.
|
|
heapGoal = int64(memstats.next_gc)
|
|
|
|
// Compute the expected scan work remaining.
|
|
//
|
|
// This is estimated based on the expected
|
|
// steady-state scannable heap. For example, with
|
|
// GOGC=100, only half of the scannable heap is
|
|
// expected to be live, so that's what we target.
|
|
//
|
|
// (This is a float calculation to avoid overflowing on
|
|
// 100*heap_scan.)
|
|
scanWorkExpected = int64(float64(memstats.heap_scan) * 100 / float64(100+gcpercent))
|
|
} else {
|
|
// We're past the soft goal. Pace GC so that in the
|
|
// worst case it will complete by the hard goal.
|
|
const maxOvershoot = 1.1
|
|
heapGoal = int64(float64(memstats.next_gc) * maxOvershoot)
|
|
|
|
// Compute the upper bound on the scan work remaining.
|
|
scanWorkExpected = int64(memstats.heap_scan)
|
|
}
|
|
|
|
// Compute the remaining scan work estimate.
|
|
//
|
|
// Note that we currently count allocations during GC as both
|
|
// scannable heap (heap_scan) and scan work completed
|
|
// (scanWork), so allocation will change this difference will
|
|
// slowly in the soft regime and not at all in the hard
|
|
// regime.
|
|
scanWorkRemaining := scanWorkExpected - c.scanWork
|
|
if scanWorkRemaining < 1000 {
|
|
// We set a somewhat arbitrary lower bound on
|
|
// remaining scan work since if we aim a little high,
|
|
// we can miss by a little.
|
|
//
|
|
// We *do* need to enforce that this is at least 1,
|
|
// since marking is racy and double-scanning objects
|
|
// may legitimately make the remaining scan work
|
|
// negative, even in the hard goal regime.
|
|
scanWorkRemaining = 1000
|
|
}
|
|
|
|
// Compute the heap distance remaining.
|
|
heapRemaining := heapGoal - int64(live)
|
|
if heapRemaining <= 0 {
|
|
// This shouldn't happen, but if it does, avoid
|
|
// dividing by zero or setting the assist negative.
|
|
heapRemaining = 1
|
|
}
|
|
|
|
// Compute the mutator assist ratio so by the time the mutator
|
|
// allocates the remaining heap bytes up to next_gc, it will
|
|
// have done (or stolen) the remaining amount of scan work.
|
|
c.assistWorkPerByte = float64(scanWorkRemaining) / float64(heapRemaining)
|
|
c.assistBytesPerWork = float64(heapRemaining) / float64(scanWorkRemaining)
|
|
}
|
|
|
|
// endCycle computes the trigger ratio for the next cycle.
|
|
func (c *gcControllerState) endCycle() float64 {
|
|
if work.userForced {
|
|
// Forced GC means this cycle didn't start at the
|
|
// trigger, so where it finished isn't good
|
|
// information about how to adjust the trigger.
|
|
// Just leave it where it is.
|
|
return memstats.triggerRatio
|
|
}
|
|
|
|
// Proportional response gain for the trigger controller. Must
|
|
// be in [0, 1]. Lower values smooth out transient effects but
|
|
// take longer to respond to phase changes. Higher values
|
|
// react to phase changes quickly, but are more affected by
|
|
// transient changes. Values near 1 may be unstable.
|
|
const triggerGain = 0.5
|
|
|
|
// Compute next cycle trigger ratio. First, this computes the
|
|
// "error" for this cycle; that is, how far off the trigger
|
|
// was from what it should have been, accounting for both heap
|
|
// growth and GC CPU utilization. We compute the actual heap
|
|
// growth during this cycle and scale that by how far off from
|
|
// the goal CPU utilization we were (to estimate the heap
|
|
// growth if we had the desired CPU utilization). The
|
|
// difference between this estimate and the GOGC-based goal
|
|
// heap growth is the error.
|
|
goalGrowthRatio := gcEffectiveGrowthRatio()
|
|
actualGrowthRatio := float64(memstats.heap_live)/float64(memstats.heap_marked) - 1
|
|
assistDuration := nanotime() - c.markStartTime
|
|
|
|
// Assume background mark hit its utilization goal.
|
|
utilization := gcBackgroundUtilization
|
|
// Add assist utilization; avoid divide by zero.
|
|
if assistDuration > 0 {
|
|
utilization += float64(c.assistTime) / float64(assistDuration*int64(gomaxprocs))
|
|
}
|
|
|
|
triggerError := goalGrowthRatio - memstats.triggerRatio - utilization/gcGoalUtilization*(actualGrowthRatio-memstats.triggerRatio)
|
|
|
|
// Finally, we adjust the trigger for next time by this error,
|
|
// damped by the proportional gain.
|
|
triggerRatio := memstats.triggerRatio + triggerGain*triggerError
|
|
|
|
if debug.gcpacertrace > 0 {
|
|
// Print controller state in terms of the design
|
|
// document.
|
|
H_m_prev := memstats.heap_marked
|
|
h_t := memstats.triggerRatio
|
|
H_T := memstats.gc_trigger
|
|
h_a := actualGrowthRatio
|
|
H_a := memstats.heap_live
|
|
h_g := goalGrowthRatio
|
|
H_g := int64(float64(H_m_prev) * (1 + h_g))
|
|
u_a := utilization
|
|
u_g := gcGoalUtilization
|
|
W_a := c.scanWork
|
|
print("pacer: H_m_prev=", H_m_prev,
|
|
" h_t=", h_t, " H_T=", H_T,
|
|
" h_a=", h_a, " H_a=", H_a,
|
|
" h_g=", h_g, " H_g=", H_g,
|
|
" u_a=", u_a, " u_g=", u_g,
|
|
" W_a=", W_a,
|
|
" goalΔ=", goalGrowthRatio-h_t,
|
|
" actualΔ=", h_a-h_t,
|
|
" u_a/u_g=", u_a/u_g,
|
|
"\n")
|
|
}
|
|
|
|
return triggerRatio
|
|
}
|
|
|
|
// enlistWorker encourages another dedicated mark worker to start on
|
|
// another P if there are spare worker slots. It is used by putfull
|
|
// when more work is made available.
|
|
//
|
|
//go:nowritebarrier
|
|
func (c *gcControllerState) enlistWorker() {
|
|
// If there are idle Ps, wake one so it will run an idle worker.
|
|
// NOTE: This is suspected of causing deadlocks. See golang.org/issue/19112.
|
|
//
|
|
// if atomic.Load(&sched.npidle) != 0 && atomic.Load(&sched.nmspinning) == 0 {
|
|
// wakep()
|
|
// return
|
|
// }
|
|
|
|
// There are no idle Ps. If we need more dedicated workers,
|
|
// try to preempt a running P so it will switch to a worker.
|
|
if c.dedicatedMarkWorkersNeeded <= 0 {
|
|
return
|
|
}
|
|
// Pick a random other P to preempt.
|
|
if gomaxprocs <= 1 {
|
|
return
|
|
}
|
|
gp := getg()
|
|
if gp == nil || gp.m == nil || gp.m.p == 0 {
|
|
return
|
|
}
|
|
myID := gp.m.p.ptr().id
|
|
for tries := 0; tries < 5; tries++ {
|
|
id := int32(fastrandn(uint32(gomaxprocs - 1)))
|
|
if id >= myID {
|
|
id++
|
|
}
|
|
p := allp[id]
|
|
if p.status != _Prunning {
|
|
continue
|
|
}
|
|
if preemptone(p) {
|
|
return
|
|
}
|
|
}
|
|
}
|
|
|
|
// findRunnableGCWorker returns the background mark worker for _p_ if it
|
|
// should be run. This must only be called when gcBlackenEnabled != 0.
|
|
func (c *gcControllerState) findRunnableGCWorker(_p_ *p) *g {
|
|
if gcBlackenEnabled == 0 {
|
|
throw("gcControllerState.findRunnable: blackening not enabled")
|
|
}
|
|
if _p_.gcBgMarkWorker == 0 {
|
|
// The mark worker associated with this P is blocked
|
|
// performing a mark transition. We can't run it
|
|
// because it may be on some other run or wait queue.
|
|
return nil
|
|
}
|
|
|
|
if !gcMarkWorkAvailable(_p_) {
|
|
// No work to be done right now. This can happen at
|
|
// the end of the mark phase when there are still
|
|
// assists tapering off. Don't bother running a worker
|
|
// now because it'll just return immediately.
|
|
return nil
|
|
}
|
|
|
|
decIfPositive := func(ptr *int64) bool {
|
|
if *ptr > 0 {
|
|
if atomic.Xaddint64(ptr, -1) >= 0 {
|
|
return true
|
|
}
|
|
// We lost a race
|
|
atomic.Xaddint64(ptr, +1)
|
|
}
|
|
return false
|
|
}
|
|
|
|
if decIfPositive(&c.dedicatedMarkWorkersNeeded) {
|
|
// This P is now dedicated to marking until the end of
|
|
// the concurrent mark phase.
|
|
_p_.gcMarkWorkerMode = gcMarkWorkerDedicatedMode
|
|
} else if c.fractionalUtilizationGoal == 0 {
|
|
// No need for fractional workers.
|
|
return nil
|
|
} else {
|
|
// Is this P behind on the fractional utilization
|
|
// goal?
|
|
//
|
|
// This should be kept in sync with pollFractionalWorkerExit.
|
|
delta := nanotime() - gcController.markStartTime
|
|
if delta > 0 && float64(_p_.gcFractionalMarkTime)/float64(delta) > c.fractionalUtilizationGoal {
|
|
// Nope. No need to run a fractional worker.
|
|
return nil
|
|
}
|
|
// Run a fractional worker.
|
|
_p_.gcMarkWorkerMode = gcMarkWorkerFractionalMode
|
|
}
|
|
|
|
// Run the background mark worker
|
|
gp := _p_.gcBgMarkWorker.ptr()
|
|
casgstatus(gp, _Gwaiting, _Grunnable)
|
|
if trace.enabled {
|
|
traceGoUnpark(gp, 0)
|
|
}
|
|
return gp
|
|
}
|
|
|
|
// pollFractionalWorkerExit reports whether a fractional mark worker
|
|
// should self-preempt. It assumes it is called from the fractional
|
|
// worker.
|
|
func pollFractionalWorkerExit() bool {
|
|
// This should be kept in sync with the fractional worker
|
|
// scheduler logic in findRunnableGCWorker.
|
|
now := nanotime()
|
|
delta := now - gcController.markStartTime
|
|
if delta <= 0 {
|
|
return true
|
|
}
|
|
p := getg().m.p.ptr()
|
|
selfTime := p.gcFractionalMarkTime + (now - p.gcMarkWorkerStartTime)
|
|
// Add some slack to the utilization goal so that the
|
|
// fractional worker isn't behind again the instant it exits.
|
|
return float64(selfTime)/float64(delta) > 1.2*gcController.fractionalUtilizationGoal
|
|
}
|
|
|
|
// gcSetTriggerRatio sets the trigger ratio and updates everything
|
|
// derived from it: the absolute trigger, the heap goal, mark pacing,
|
|
// and sweep pacing.
|
|
//
|
|
// This can be called any time. If GC is the in the middle of a
|
|
// concurrent phase, it will adjust the pacing of that phase.
|
|
//
|
|
// This depends on gcpercent, memstats.heap_marked, and
|
|
// memstats.heap_live. These must be up to date.
|
|
//
|
|
// mheap_.lock must be held or the world must be stopped.
|
|
func gcSetTriggerRatio(triggerRatio float64) {
|
|
// Compute the next GC goal, which is when the allocated heap
|
|
// has grown by GOGC/100 over the heap marked by the last
|
|
// cycle.
|
|
goal := ^uint64(0)
|
|
if gcpercent >= 0 {
|
|
goal = memstats.heap_marked + memstats.heap_marked*uint64(gcpercent)/100
|
|
}
|
|
|
|
// Set the trigger ratio, capped to reasonable bounds.
|
|
if triggerRatio < 0 {
|
|
// This can happen if the mutator is allocating very
|
|
// quickly or the GC is scanning very slowly.
|
|
triggerRatio = 0
|
|
} else if gcpercent >= 0 {
|
|
// Ensure there's always a little margin so that the
|
|
// mutator assist ratio isn't infinity.
|
|
maxTriggerRatio := 0.95 * float64(gcpercent) / 100
|
|
if triggerRatio > maxTriggerRatio {
|
|
triggerRatio = maxTriggerRatio
|
|
}
|
|
}
|
|
memstats.triggerRatio = triggerRatio
|
|
|
|
// Compute the absolute GC trigger from the trigger ratio.
|
|
//
|
|
// We trigger the next GC cycle when the allocated heap has
|
|
// grown by the trigger ratio over the marked heap size.
|
|
trigger := ^uint64(0)
|
|
if gcpercent >= 0 {
|
|
trigger = uint64(float64(memstats.heap_marked) * (1 + triggerRatio))
|
|
// Don't trigger below the minimum heap size.
|
|
minTrigger := heapminimum
|
|
if !isSweepDone() {
|
|
// Concurrent sweep happens in the heap growth
|
|
// from heap_live to gc_trigger, so ensure
|
|
// that concurrent sweep has some heap growth
|
|
// in which to perform sweeping before we
|
|
// start the next GC cycle.
|
|
sweepMin := atomic.Load64(&memstats.heap_live) + sweepMinHeapDistance
|
|
if sweepMin > minTrigger {
|
|
minTrigger = sweepMin
|
|
}
|
|
}
|
|
if trigger < minTrigger {
|
|
trigger = minTrigger
|
|
}
|
|
if int64(trigger) < 0 {
|
|
print("runtime: next_gc=", memstats.next_gc, " heap_marked=", memstats.heap_marked, " heap_live=", memstats.heap_live, " initialHeapLive=", work.initialHeapLive, "triggerRatio=", triggerRatio, " minTrigger=", minTrigger, "\n")
|
|
throw("gc_trigger underflow")
|
|
}
|
|
if trigger > goal {
|
|
// The trigger ratio is always less than GOGC/100, but
|
|
// other bounds on the trigger may have raised it.
|
|
// Push up the goal, too.
|
|
goal = trigger
|
|
}
|
|
}
|
|
|
|
// Commit to the trigger and goal.
|
|
memstats.gc_trigger = trigger
|
|
memstats.next_gc = goal
|
|
if trace.enabled {
|
|
traceNextGC()
|
|
}
|
|
|
|
// Update mark pacing.
|
|
if gcphase != _GCoff {
|
|
gcController.revise()
|
|
}
|
|
|
|
// Update sweep pacing.
|
|
if isSweepDone() {
|
|
mheap_.sweepPagesPerByte = 0
|
|
} else {
|
|
// Concurrent sweep needs to sweep all of the in-use
|
|
// pages by the time the allocated heap reaches the GC
|
|
// trigger. Compute the ratio of in-use pages to sweep
|
|
// per byte allocated, accounting for the fact that
|
|
// some might already be swept.
|
|
heapLiveBasis := atomic.Load64(&memstats.heap_live)
|
|
heapDistance := int64(trigger) - int64(heapLiveBasis)
|
|
// Add a little margin so rounding errors and
|
|
// concurrent sweep are less likely to leave pages
|
|
// unswept when GC starts.
|
|
heapDistance -= 1024 * 1024
|
|
if heapDistance < _PageSize {
|
|
// Avoid setting the sweep ratio extremely high
|
|
heapDistance = _PageSize
|
|
}
|
|
pagesSwept := atomic.Load64(&mheap_.pagesSwept)
|
|
sweepDistancePages := int64(mheap_.pagesInUse) - int64(pagesSwept)
|
|
if sweepDistancePages <= 0 {
|
|
mheap_.sweepPagesPerByte = 0
|
|
} else {
|
|
mheap_.sweepPagesPerByte = float64(sweepDistancePages) / float64(heapDistance)
|
|
mheap_.sweepHeapLiveBasis = heapLiveBasis
|
|
// Write pagesSweptBasis last, since this
|
|
// signals concurrent sweeps to recompute
|
|
// their debt.
|
|
atomic.Store64(&mheap_.pagesSweptBasis, pagesSwept)
|
|
}
|
|
}
|
|
|
|
gcPaceScavenger()
|
|
}
|
|
|
|
// gcEffectiveGrowthRatio returns the current effective heap growth
|
|
// ratio (GOGC/100) based on heap_marked from the previous GC and
|
|
// next_gc for the current GC.
|
|
//
|
|
// This may differ from gcpercent/100 because of various upper and
|
|
// lower bounds on gcpercent. For example, if the heap is smaller than
|
|
// heapminimum, this can be higher than gcpercent/100.
|
|
//
|
|
// mheap_.lock must be held or the world must be stopped.
|
|
func gcEffectiveGrowthRatio() float64 {
|
|
egogc := float64(memstats.next_gc-memstats.heap_marked) / float64(memstats.heap_marked)
|
|
if egogc < 0 {
|
|
// Shouldn't happen, but just in case.
|
|
egogc = 0
|
|
}
|
|
return egogc
|
|
}
|
|
|
|
// gcGoalUtilization is the goal CPU utilization for
|
|
// marking as a fraction of GOMAXPROCS.
|
|
const gcGoalUtilization = 0.30
|
|
|
|
// gcBackgroundUtilization is the fixed CPU utilization for background
|
|
// marking. It must be <= gcGoalUtilization. The difference between
|
|
// gcGoalUtilization and gcBackgroundUtilization will be made up by
|
|
// mark assists. The scheduler will aim to use within 50% of this
|
|
// goal.
|
|
//
|
|
// Setting this to < gcGoalUtilization avoids saturating the trigger
|
|
// feedback controller when there are no assists, which allows it to
|
|
// better control CPU and heap growth. However, the larger the gap,
|
|
// the more mutator assists are expected to happen, which impact
|
|
// mutator latency.
|
|
const gcBackgroundUtilization = 0.25
|
|
|
|
// gcCreditSlack is the amount of scan work credit that can
|
|
// accumulate locally before updating gcController.scanWork and,
|
|
// optionally, gcController.bgScanCredit. Lower values give a more
|
|
// accurate assist ratio and make it more likely that assists will
|
|
// successfully steal background credit. Higher values reduce memory
|
|
// contention.
|
|
const gcCreditSlack = 2000
|
|
|
|
// gcAssistTimeSlack is the nanoseconds of mutator assist time that
|
|
// can accumulate on a P before updating gcController.assistTime.
|
|
const gcAssistTimeSlack = 5000
|
|
|
|
// gcOverAssistWork determines how many extra units of scan work a GC
|
|
// assist does when an assist happens. This amortizes the cost of an
|
|
// assist by pre-paying for this many bytes of future allocations.
|
|
const gcOverAssistWork = 64 << 10
|
|
|
|
var work struct {
|
|
full lfstack // lock-free list of full blocks workbuf
|
|
empty lfstack // lock-free list of empty blocks workbuf
|
|
pad0 cpu.CacheLinePad // prevents false-sharing between full/empty and nproc/nwait
|
|
|
|
wbufSpans struct {
|
|
lock mutex
|
|
// free is a list of spans dedicated to workbufs, but
|
|
// that don't currently contain any workbufs.
|
|
free mSpanList
|
|
// busy is a list of all spans containing workbufs on
|
|
// one of the workbuf lists.
|
|
busy mSpanList
|
|
}
|
|
|
|
// Restore 64-bit alignment on 32-bit.
|
|
_ uint32
|
|
|
|
// bytesMarked is the number of bytes marked this cycle. This
|
|
// includes bytes blackened in scanned objects, noscan objects
|
|
// that go straight to black, and permagrey objects scanned by
|
|
// markroot during the concurrent scan phase. This is updated
|
|
// atomically during the cycle. Updates may be batched
|
|
// arbitrarily, since the value is only read at the end of the
|
|
// cycle.
|
|
//
|
|
// Because of benign races during marking, this number may not
|
|
// be the exact number of marked bytes, but it should be very
|
|
// close.
|
|
//
|
|
// Put this field here because it needs 64-bit atomic access
|
|
// (and thus 8-byte alignment even on 32-bit architectures).
|
|
bytesMarked uint64
|
|
|
|
markrootNext uint32 // next markroot job
|
|
markrootJobs uint32 // number of markroot jobs
|
|
|
|
nproc uint32
|
|
tstart int64
|
|
nwait uint32
|
|
ndone uint32
|
|
|
|
// Number of roots of various root types. Set by gcMarkRootPrepare.
|
|
nFlushCacheRoots int
|
|
nDataRoots, nSpanRoots, nStackRoots int
|
|
|
|
// Each type of GC state transition is protected by a lock.
|
|
// Since multiple threads can simultaneously detect the state
|
|
// transition condition, any thread that detects a transition
|
|
// condition must acquire the appropriate transition lock,
|
|
// re-check the transition condition and return if it no
|
|
// longer holds or perform the transition if it does.
|
|
// Likewise, any transition must invalidate the transition
|
|
// condition before releasing the lock. This ensures that each
|
|
// transition is performed by exactly one thread and threads
|
|
// that need the transition to happen block until it has
|
|
// happened.
|
|
//
|
|
// startSema protects the transition from "off" to mark or
|
|
// mark termination.
|
|
startSema uint32
|
|
// markDoneSema protects transitions from mark to mark termination.
|
|
markDoneSema uint32
|
|
|
|
bgMarkReady note // signal background mark worker has started
|
|
bgMarkDone uint32 // cas to 1 when at a background mark completion point
|
|
// Background mark completion signaling
|
|
|
|
// mode is the concurrency mode of the current GC cycle.
|
|
mode gcMode
|
|
|
|
// userForced indicates the current GC cycle was forced by an
|
|
// explicit user call.
|
|
userForced bool
|
|
|
|
// totaltime is the CPU nanoseconds spent in GC since the
|
|
// program started if debug.gctrace > 0.
|
|
totaltime int64
|
|
|
|
// initialHeapLive is the value of memstats.heap_live at the
|
|
// beginning of this GC cycle.
|
|
initialHeapLive uint64
|
|
|
|
// assistQueue is a queue of assists that are blocked because
|
|
// there was neither enough credit to steal or enough work to
|
|
// do.
|
|
assistQueue struct {
|
|
lock mutex
|
|
q gQueue
|
|
}
|
|
|
|
// sweepWaiters is a list of blocked goroutines to wake when
|
|
// we transition from mark termination to sweep.
|
|
sweepWaiters struct {
|
|
lock mutex
|
|
list gList
|
|
}
|
|
|
|
// cycles is the number of completed GC cycles, where a GC
|
|
// cycle is sweep termination, mark, mark termination, and
|
|
// sweep. This differs from memstats.numgc, which is
|
|
// incremented at mark termination.
|
|
cycles uint32
|
|
|
|
// Timing/utilization stats for this cycle.
|
|
stwprocs, maxprocs int32
|
|
tSweepTerm, tMark, tMarkTerm, tEnd int64 // nanotime() of phase start
|
|
|
|
pauseNS int64 // total STW time this cycle
|
|
pauseStart int64 // nanotime() of last STW
|
|
|
|
// debug.gctrace heap sizes for this cycle.
|
|
heap0, heap1, heap2, heapGoal uint64
|
|
}
|
|
|
|
// GC runs a garbage collection and blocks the caller until the
|
|
// garbage collection is complete. It may also block the entire
|
|
// program.
|
|
func GC() {
|
|
// We consider a cycle to be: sweep termination, mark, mark
|
|
// termination, and sweep. This function shouldn't return
|
|
// until a full cycle has been completed, from beginning to
|
|
// end. Hence, we always want to finish up the current cycle
|
|
// and start a new one. That means:
|
|
//
|
|
// 1. In sweep termination, mark, or mark termination of cycle
|
|
// N, wait until mark termination N completes and transitions
|
|
// to sweep N.
|
|
//
|
|
// 2. In sweep N, help with sweep N.
|
|
//
|
|
// At this point we can begin a full cycle N+1.
|
|
//
|
|
// 3. Trigger cycle N+1 by starting sweep termination N+1.
|
|
//
|
|
// 4. Wait for mark termination N+1 to complete.
|
|
//
|
|
// 5. Help with sweep N+1 until it's done.
|
|
//
|
|
// This all has to be written to deal with the fact that the
|
|
// GC may move ahead on its own. For example, when we block
|
|
// until mark termination N, we may wake up in cycle N+2.
|
|
|
|
// Wait until the current sweep termination, mark, and mark
|
|
// termination complete.
|
|
n := atomic.Load(&work.cycles)
|
|
gcWaitOnMark(n)
|
|
|
|
// We're now in sweep N or later. Trigger GC cycle N+1, which
|
|
// will first finish sweep N if necessary and then enter sweep
|
|
// termination N+1.
|
|
gcStart(gcTrigger{kind: gcTriggerCycle, n: n + 1})
|
|
|
|
// Wait for mark termination N+1 to complete.
|
|
gcWaitOnMark(n + 1)
|
|
|
|
// Finish sweep N+1 before returning. We do this both to
|
|
// complete the cycle and because runtime.GC() is often used
|
|
// as part of tests and benchmarks to get the system into a
|
|
// relatively stable and isolated state.
|
|
for atomic.Load(&work.cycles) == n+1 && sweepone() != ^uintptr(0) {
|
|
sweep.nbgsweep++
|
|
Gosched()
|
|
}
|
|
|
|
// Callers may assume that the heap profile reflects the
|
|
// just-completed cycle when this returns (historically this
|
|
// happened because this was a STW GC), but right now the
|
|
// profile still reflects mark termination N, not N+1.
|
|
//
|
|
// As soon as all of the sweep frees from cycle N+1 are done,
|
|
// we can go ahead and publish the heap profile.
|
|
//
|
|
// First, wait for sweeping to finish. (We know there are no
|
|
// more spans on the sweep queue, but we may be concurrently
|
|
// sweeping spans, so we have to wait.)
|
|
for atomic.Load(&work.cycles) == n+1 && atomic.Load(&mheap_.sweepers) != 0 {
|
|
Gosched()
|
|
}
|
|
|
|
// Now we're really done with sweeping, so we can publish the
|
|
// stable heap profile. Only do this if we haven't already hit
|
|
// another mark termination.
|
|
mp := acquirem()
|
|
cycle := atomic.Load(&work.cycles)
|
|
if cycle == n+1 || (gcphase == _GCmark && cycle == n+2) {
|
|
mProf_PostSweep()
|
|
}
|
|
releasem(mp)
|
|
}
|
|
|
|
// gcWaitOnMark blocks until GC finishes the Nth mark phase. If GC has
|
|
// already completed this mark phase, it returns immediately.
|
|
func gcWaitOnMark(n uint32) {
|
|
for {
|
|
// Disable phase transitions.
|
|
lock(&work.sweepWaiters.lock)
|
|
nMarks := atomic.Load(&work.cycles)
|
|
if gcphase != _GCmark {
|
|
// We've already completed this cycle's mark.
|
|
nMarks++
|
|
}
|
|
if nMarks > n {
|
|
// We're done.
|
|
unlock(&work.sweepWaiters.lock)
|
|
return
|
|
}
|
|
|
|
// Wait until sweep termination, mark, and mark
|
|
// termination of cycle N complete.
|
|
work.sweepWaiters.list.push(getg())
|
|
goparkunlock(&work.sweepWaiters.lock, waitReasonWaitForGCCycle, traceEvGoBlock, 1)
|
|
}
|
|
}
|
|
|
|
// gcMode indicates how concurrent a GC cycle should be.
|
|
type gcMode int
|
|
|
|
const (
|
|
gcBackgroundMode gcMode = iota // concurrent GC and sweep
|
|
gcForceMode // stop-the-world GC now, concurrent sweep
|
|
gcForceBlockMode // stop-the-world GC now and STW sweep (forced by user)
|
|
)
|
|
|
|
// A gcTrigger is a predicate for starting a GC cycle. Specifically,
|
|
// it is an exit condition for the _GCoff phase.
|
|
type gcTrigger struct {
|
|
kind gcTriggerKind
|
|
now int64 // gcTriggerTime: current time
|
|
n uint32 // gcTriggerCycle: cycle number to start
|
|
}
|
|
|
|
type gcTriggerKind int
|
|
|
|
const (
|
|
// gcTriggerHeap indicates that a cycle should be started when
|
|
// the heap size reaches the trigger heap size computed by the
|
|
// controller.
|
|
gcTriggerHeap gcTriggerKind = iota
|
|
|
|
// gcTriggerTime indicates that a cycle should be started when
|
|
// it's been more than forcegcperiod nanoseconds since the
|
|
// previous GC cycle.
|
|
gcTriggerTime
|
|
|
|
// gcTriggerCycle indicates that a cycle should be started if
|
|
// we have not yet started cycle number gcTrigger.n (relative
|
|
// to work.cycles).
|
|
gcTriggerCycle
|
|
)
|
|
|
|
// test reports whether the trigger condition is satisfied, meaning
|
|
// that the exit condition for the _GCoff phase has been met. The exit
|
|
// condition should be tested when allocating.
|
|
func (t gcTrigger) test() bool {
|
|
if !memstats.enablegc || panicking != 0 || gcphase != _GCoff {
|
|
return false
|
|
}
|
|
switch t.kind {
|
|
case gcTriggerHeap:
|
|
// Non-atomic access to heap_live for performance. If
|
|
// we are going to trigger on this, this thread just
|
|
// atomically wrote heap_live anyway and we'll see our
|
|
// own write.
|
|
return memstats.heap_live >= memstats.gc_trigger
|
|
case gcTriggerTime:
|
|
if gcpercent < 0 {
|
|
return false
|
|
}
|
|
lastgc := int64(atomic.Load64(&memstats.last_gc_nanotime))
|
|
return lastgc != 0 && t.now-lastgc > forcegcperiod
|
|
case gcTriggerCycle:
|
|
// t.n > work.cycles, but accounting for wraparound.
|
|
return int32(t.n-work.cycles) > 0
|
|
}
|
|
return true
|
|
}
|
|
|
|
// gcStart starts the GC. It transitions from _GCoff to _GCmark (if
|
|
// debug.gcstoptheworld == 0) or performs all of GC (if
|
|
// debug.gcstoptheworld != 0).
|
|
//
|
|
// This may return without performing this transition in some cases,
|
|
// such as when called on a system stack or with locks held.
|
|
func gcStart(trigger gcTrigger) {
|
|
// Since this is called from malloc and malloc is called in
|
|
// the guts of a number of libraries that might be holding
|
|
// locks, don't attempt to start GC in non-preemptible or
|
|
// potentially unstable situations.
|
|
mp := acquirem()
|
|
if gp := getg(); gp == mp.g0 || mp.locks > 1 || mp.preemptoff != "" {
|
|
releasem(mp)
|
|
return
|
|
}
|
|
releasem(mp)
|
|
mp = nil
|
|
|
|
// Pick up the remaining unswept/not being swept spans concurrently
|
|
//
|
|
// This shouldn't happen if we're being invoked in background
|
|
// mode since proportional sweep should have just finished
|
|
// sweeping everything, but rounding errors, etc, may leave a
|
|
// few spans unswept. In forced mode, this is necessary since
|
|
// GC can be forced at any point in the sweeping cycle.
|
|
//
|
|
// We check the transition condition continuously here in case
|
|
// this G gets delayed in to the next GC cycle.
|
|
for trigger.test() && sweepone() != ^uintptr(0) {
|
|
sweep.nbgsweep++
|
|
}
|
|
|
|
// Perform GC initialization and the sweep termination
|
|
// transition.
|
|
semacquire(&work.startSema)
|
|
// Re-check transition condition under transition lock.
|
|
if !trigger.test() {
|
|
semrelease(&work.startSema)
|
|
return
|
|
}
|
|
|
|
// For stats, check if this GC was forced by the user.
|
|
work.userForced = trigger.kind == gcTriggerCycle
|
|
|
|
// In gcstoptheworld debug mode, upgrade the mode accordingly.
|
|
// We do this after re-checking the transition condition so
|
|
// that multiple goroutines that detect the heap trigger don't
|
|
// start multiple STW GCs.
|
|
mode := gcBackgroundMode
|
|
if debug.gcstoptheworld == 1 {
|
|
mode = gcForceMode
|
|
} else if debug.gcstoptheworld == 2 {
|
|
mode = gcForceBlockMode
|
|
}
|
|
|
|
// Ok, we're doing it! Stop everybody else
|
|
semacquire(&worldsema)
|
|
|
|
if trace.enabled {
|
|
traceGCStart()
|
|
}
|
|
|
|
// Check that all Ps have finished deferred mcache flushes.
|
|
for _, p := range allp {
|
|
if fg := atomic.Load(&p.mcache.flushGen); fg != mheap_.sweepgen {
|
|
println("runtime: p", p.id, "flushGen", fg, "!= sweepgen", mheap_.sweepgen)
|
|
throw("p mcache not flushed")
|
|
}
|
|
}
|
|
|
|
gcBgMarkStartWorkers()
|
|
|
|
systemstack(gcResetMarkState)
|
|
|
|
work.stwprocs, work.maxprocs = gomaxprocs, gomaxprocs
|
|
if work.stwprocs > ncpu {
|
|
// This is used to compute CPU time of the STW phases,
|
|
// so it can't be more than ncpu, even if GOMAXPROCS is.
|
|
work.stwprocs = ncpu
|
|
}
|
|
work.heap0 = atomic.Load64(&memstats.heap_live)
|
|
work.pauseNS = 0
|
|
work.mode = mode
|
|
|
|
now := nanotime()
|
|
work.tSweepTerm = now
|
|
work.pauseStart = now
|
|
if trace.enabled {
|
|
traceGCSTWStart(1)
|
|
}
|
|
systemstack(stopTheWorldWithSema)
|
|
// Finish sweep before we start concurrent scan.
|
|
systemstack(func() {
|
|
finishsweep_m()
|
|
})
|
|
// clearpools before we start the GC. If we wait they memory will not be
|
|
// reclaimed until the next GC cycle.
|
|
clearpools()
|
|
|
|
work.cycles++
|
|
|
|
gcController.startCycle()
|
|
work.heapGoal = memstats.next_gc
|
|
|
|
// In STW mode, disable scheduling of user Gs. This may also
|
|
// disable scheduling of this goroutine, so it may block as
|
|
// soon as we start the world again.
|
|
if mode != gcBackgroundMode {
|
|
schedEnableUser(false)
|
|
}
|
|
|
|
// Enter concurrent mark phase and enable
|
|
// write barriers.
|
|
//
|
|
// Because the world is stopped, all Ps will
|
|
// observe that write barriers are enabled by
|
|
// the time we start the world and begin
|
|
// scanning.
|
|
//
|
|
// Write barriers must be enabled before assists are
|
|
// enabled because they must be enabled before
|
|
// any non-leaf heap objects are marked. Since
|
|
// allocations are blocked until assists can
|
|
// happen, we want enable assists as early as
|
|
// possible.
|
|
setGCPhase(_GCmark)
|
|
|
|
gcBgMarkPrepare() // Must happen before assist enable.
|
|
gcMarkRootPrepare()
|
|
|
|
// Mark all active tinyalloc blocks. Since we're
|
|
// allocating from these, they need to be black like
|
|
// other allocations. The alternative is to blacken
|
|
// the tiny block on every allocation from it, which
|
|
// would slow down the tiny allocator.
|
|
gcMarkTinyAllocs()
|
|
|
|
// At this point all Ps have enabled the write
|
|
// barrier, thus maintaining the no white to
|
|
// black invariant. Enable mutator assists to
|
|
// put back-pressure on fast allocating
|
|
// mutators.
|
|
atomic.Store(&gcBlackenEnabled, 1)
|
|
|
|
// Assists and workers can start the moment we start
|
|
// the world.
|
|
gcController.markStartTime = now
|
|
|
|
// Concurrent mark.
|
|
systemstack(func() {
|
|
now = startTheWorldWithSema(trace.enabled)
|
|
work.pauseNS += now - work.pauseStart
|
|
work.tMark = now
|
|
})
|
|
// In STW mode, we could block the instant systemstack
|
|
// returns, so don't do anything important here. Make sure we
|
|
// block rather than returning to user code.
|
|
if mode != gcBackgroundMode {
|
|
Gosched()
|
|
}
|
|
|
|
semrelease(&work.startSema)
|
|
}
|
|
|
|
// gcMarkDoneFlushed counts the number of P's with flushed work.
|
|
//
|
|
// Ideally this would be a captured local in gcMarkDone, but forEachP
|
|
// escapes its callback closure, so it can't capture anything.
|
|
//
|
|
// This is protected by markDoneSema.
|
|
var gcMarkDoneFlushed uint32
|
|
|
|
// debugCachedWork enables extra checks for debugging premature mark
|
|
// termination.
|
|
//
|
|
// For debugging issue #27993.
|
|
const debugCachedWork = false
|
|
|
|
// gcWorkPauseGen is for debugging the mark completion algorithm.
|
|
// gcWork put operations spin while gcWork.pauseGen == gcWorkPauseGen.
|
|
// Only used if debugCachedWork is true.
|
|
//
|
|
// For debugging issue #27993.
|
|
var gcWorkPauseGen uint32 = 1
|
|
|
|
// gcMarkDone transitions the GC from mark to mark termination if all
|
|
// reachable objects have been marked (that is, there are no grey
|
|
// objects and can be no more in the future). Otherwise, it flushes
|
|
// all local work to the global queues where it can be discovered by
|
|
// other workers.
|
|
//
|
|
// This should be called when all local mark work has been drained and
|
|
// there are no remaining workers. Specifically, when
|
|
//
|
|
// work.nwait == work.nproc && !gcMarkWorkAvailable(p)
|
|
//
|
|
// The calling context must be preemptible.
|
|
//
|
|
// Flushing local work is important because idle Ps may have local
|
|
// work queued. This is the only way to make that work visible and
|
|
// drive GC to completion.
|
|
//
|
|
// It is explicitly okay to have write barriers in this function. If
|
|
// it does transition to mark termination, then all reachable objects
|
|
// have been marked, so the write barrier cannot shade any more
|
|
// objects.
|
|
func gcMarkDone() {
|
|
// Ensure only one thread is running the ragged barrier at a
|
|
// time.
|
|
semacquire(&work.markDoneSema)
|
|
|
|
top:
|
|
// Re-check transition condition under transition lock.
|
|
//
|
|
// It's critical that this checks the global work queues are
|
|
// empty before performing the ragged barrier. Otherwise,
|
|
// there could be global work that a P could take after the P
|
|
// has passed the ragged barrier.
|
|
if !(gcphase == _GCmark && work.nwait == work.nproc && !gcMarkWorkAvailable(nil)) {
|
|
semrelease(&work.markDoneSema)
|
|
return
|
|
}
|
|
|
|
// Flush all local buffers and collect flushedWork flags.
|
|
gcMarkDoneFlushed = 0
|
|
systemstack(func() {
|
|
gp := getg().m.curg
|
|
// Mark the user stack as preemptible so that it may be scanned.
|
|
// Otherwise, our attempt to force all P's to a safepoint could
|
|
// result in a deadlock as we attempt to preempt a worker that's
|
|
// trying to preempt us (e.g. for a stack scan).
|
|
casgstatus(gp, _Grunning, _Gwaiting)
|
|
forEachP(func(_p_ *p) {
|
|
// Flush the write barrier buffer, since this may add
|
|
// work to the gcWork.
|
|
wbBufFlush1(_p_)
|
|
// For debugging, shrink the write barrier
|
|
// buffer so it flushes immediately.
|
|
// wbBuf.reset will keep it at this size as
|
|
// long as throwOnGCWork is set.
|
|
if debugCachedWork {
|
|
b := &_p_.wbBuf
|
|
b.end = uintptr(unsafe.Pointer(&b.buf[wbBufEntryPointers]))
|
|
b.debugGen = gcWorkPauseGen
|
|
}
|
|
// Flush the gcWork, since this may create global work
|
|
// and set the flushedWork flag.
|
|
//
|
|
// TODO(austin): Break up these workbufs to
|
|
// better distribute work.
|
|
_p_.gcw.dispose()
|
|
// Collect the flushedWork flag.
|
|
if _p_.gcw.flushedWork {
|
|
atomic.Xadd(&gcMarkDoneFlushed, 1)
|
|
_p_.gcw.flushedWork = false
|
|
} else if debugCachedWork {
|
|
// For debugging, freeze the gcWork
|
|
// until we know whether we've reached
|
|
// completion or not. If we think
|
|
// we've reached completion, but
|
|
// there's a paused gcWork, then
|
|
// that's a bug.
|
|
_p_.gcw.pauseGen = gcWorkPauseGen
|
|
// Capture the G's stack.
|
|
for i := range _p_.gcw.pauseStack {
|
|
_p_.gcw.pauseStack[i].pc = 0
|
|
}
|
|
callers(1, _p_.gcw.pauseStack[:])
|
|
}
|
|
})
|
|
casgstatus(gp, _Gwaiting, _Grunning)
|
|
})
|
|
|
|
if gcMarkDoneFlushed != 0 {
|
|
if debugCachedWork {
|
|
// Release paused gcWorks.
|
|
atomic.Xadd(&gcWorkPauseGen, 1)
|
|
}
|
|
// More grey objects were discovered since the
|
|
// previous termination check, so there may be more
|
|
// work to do. Keep going. It's possible the
|
|
// transition condition became true again during the
|
|
// ragged barrier, so re-check it.
|
|
goto top
|
|
}
|
|
|
|
if debugCachedWork {
|
|
throwOnGCWork = true
|
|
// Release paused gcWorks. If there are any, they
|
|
// should now observe throwOnGCWork and panic.
|
|
atomic.Xadd(&gcWorkPauseGen, 1)
|
|
}
|
|
|
|
// There was no global work, no local work, and no Ps
|
|
// communicated work since we took markDoneSema. Therefore
|
|
// there are no grey objects and no more objects can be
|
|
// shaded. Transition to mark termination.
|
|
now := nanotime()
|
|
work.tMarkTerm = now
|
|
work.pauseStart = now
|
|
getg().m.preemptoff = "gcing"
|
|
if trace.enabled {
|
|
traceGCSTWStart(0)
|
|
}
|
|
systemstack(stopTheWorldWithSema)
|
|
// The gcphase is _GCmark, it will transition to _GCmarktermination
|
|
// below. The important thing is that the wb remains active until
|
|
// all marking is complete. This includes writes made by the GC.
|
|
|
|
if debugCachedWork {
|
|
// For debugging, double check that no work was added after we
|
|
// went around above and disable write barrier buffering.
|
|
for _, p := range allp {
|
|
gcw := &p.gcw
|
|
if !gcw.empty() {
|
|
printlock()
|
|
print("runtime: P ", p.id, " flushedWork ", gcw.flushedWork)
|
|
if gcw.wbuf1 == nil {
|
|
print(" wbuf1=<nil>")
|
|
} else {
|
|
print(" wbuf1.n=", gcw.wbuf1.nobj)
|
|
}
|
|
if gcw.wbuf2 == nil {
|
|
print(" wbuf2=<nil>")
|
|
} else {
|
|
print(" wbuf2.n=", gcw.wbuf2.nobj)
|
|
}
|
|
print("\n")
|
|
if gcw.pauseGen == gcw.putGen {
|
|
println("runtime: checkPut already failed at this generation")
|
|
}
|
|
throw("throwOnGCWork")
|
|
}
|
|
}
|
|
} else {
|
|
// For unknown reasons (see issue #27993), there is
|
|
// sometimes work left over when we enter mark
|
|
// termination. Detect this and resume concurrent
|
|
// mark. This is obviously unfortunate.
|
|
//
|
|
// Switch to the system stack to call wbBufFlush1,
|
|
// though in this case it doesn't matter because we're
|
|
// non-preemptible anyway.
|
|
restart := false
|
|
systemstack(func() {
|
|
for _, p := range allp {
|
|
wbBufFlush1(p)
|
|
if !p.gcw.empty() {
|
|
restart = true
|
|
break
|
|
}
|
|
}
|
|
})
|
|
if restart {
|
|
getg().m.preemptoff = ""
|
|
systemstack(func() {
|
|
now := startTheWorldWithSema(true)
|
|
work.pauseNS += now - work.pauseStart
|
|
})
|
|
goto top
|
|
}
|
|
}
|
|
|
|
// Disable assists and background workers. We must do
|
|
// this before waking blocked assists.
|
|
atomic.Store(&gcBlackenEnabled, 0)
|
|
|
|
// Wake all blocked assists. These will run when we
|
|
// start the world again.
|
|
gcWakeAllAssists()
|
|
|
|
// Likewise, release the transition lock. Blocked
|
|
// workers and assists will run when we start the
|
|
// world again.
|
|
semrelease(&work.markDoneSema)
|
|
|
|
// In STW mode, re-enable user goroutines. These will be
|
|
// queued to run after we start the world.
|
|
schedEnableUser(true)
|
|
|
|
// endCycle depends on all gcWork cache stats being flushed.
|
|
// The termination algorithm above ensured that up to
|
|
// allocations since the ragged barrier.
|
|
nextTriggerRatio := gcController.endCycle()
|
|
|
|
// Perform mark termination. This will restart the world.
|
|
gcMarkTermination(nextTriggerRatio)
|
|
}
|
|
|
|
func gcMarkTermination(nextTriggerRatio float64) {
|
|
// World is stopped.
|
|
// Start marktermination which includes enabling the write barrier.
|
|
atomic.Store(&gcBlackenEnabled, 0)
|
|
setGCPhase(_GCmarktermination)
|
|
|
|
work.heap1 = memstats.heap_live
|
|
startTime := nanotime()
|
|
|
|
mp := acquirem()
|
|
mp.preemptoff = "gcing"
|
|
_g_ := getg()
|
|
_g_.m.traceback = 2
|
|
gp := _g_.m.curg
|
|
casgstatus(gp, _Grunning, _Gwaiting)
|
|
gp.waitreason = waitReasonGarbageCollection
|
|
|
|
// Run gc on the g0 stack. We do this so that the g stack
|
|
// we're currently running on will no longer change. Cuts
|
|
// the root set down a bit (g0 stacks are not scanned, and
|
|
// we don't need to scan gc's internal state). We also
|
|
// need to switch to g0 so we can shrink the stack.
|
|
systemstack(func() {
|
|
gcMark(startTime)
|
|
// Must return immediately.
|
|
// The outer function's stack may have moved
|
|
// during gcMark (it shrinks stacks, including the
|
|
// outer function's stack), so we must not refer
|
|
// to any of its variables. Return back to the
|
|
// non-system stack to pick up the new addresses
|
|
// before continuing.
|
|
})
|
|
|
|
systemstack(func() {
|
|
work.heap2 = work.bytesMarked
|
|
if debug.gccheckmark > 0 {
|
|
// Run a full non-parallel, stop-the-world
|
|
// mark using checkmark bits, to check that we
|
|
// didn't forget to mark anything during the
|
|
// concurrent mark process.
|
|
gcResetMarkState()
|
|
initCheckmarks()
|
|
gcw := &getg().m.p.ptr().gcw
|
|
gcDrain(gcw, 0)
|
|
wbBufFlush1(getg().m.p.ptr())
|
|
gcw.dispose()
|
|
clearCheckmarks()
|
|
}
|
|
|
|
// marking is complete so we can turn the write barrier off
|
|
setGCPhase(_GCoff)
|
|
gcSweep(work.mode)
|
|
})
|
|
|
|
_g_.m.traceback = 0
|
|
casgstatus(gp, _Gwaiting, _Grunning)
|
|
|
|
if trace.enabled {
|
|
traceGCDone()
|
|
}
|
|
|
|
// all done
|
|
mp.preemptoff = ""
|
|
|
|
if gcphase != _GCoff {
|
|
throw("gc done but gcphase != _GCoff")
|
|
}
|
|
|
|
// Update GC trigger and pacing for the next cycle.
|
|
gcSetTriggerRatio(nextTriggerRatio)
|
|
|
|
// Update timing memstats
|
|
now := nanotime()
|
|
sec, nsec, _ := time_now()
|
|
unixNow := sec*1e9 + int64(nsec)
|
|
work.pauseNS += now - work.pauseStart
|
|
work.tEnd = now
|
|
atomic.Store64(&memstats.last_gc_unix, uint64(unixNow)) // must be Unix time to make sense to user
|
|
atomic.Store64(&memstats.last_gc_nanotime, uint64(now)) // monotonic time for us
|
|
memstats.pause_ns[memstats.numgc%uint32(len(memstats.pause_ns))] = uint64(work.pauseNS)
|
|
memstats.pause_end[memstats.numgc%uint32(len(memstats.pause_end))] = uint64(unixNow)
|
|
memstats.pause_total_ns += uint64(work.pauseNS)
|
|
|
|
// Update work.totaltime.
|
|
sweepTermCpu := int64(work.stwprocs) * (work.tMark - work.tSweepTerm)
|
|
// We report idle marking time below, but omit it from the
|
|
// overall utilization here since it's "free".
|
|
markCpu := gcController.assistTime + gcController.dedicatedMarkTime + gcController.fractionalMarkTime
|
|
markTermCpu := int64(work.stwprocs) * (work.tEnd - work.tMarkTerm)
|
|
cycleCpu := sweepTermCpu + markCpu + markTermCpu
|
|
work.totaltime += cycleCpu
|
|
|
|
// Compute overall GC CPU utilization.
|
|
totalCpu := sched.totaltime + (now-sched.procresizetime)*int64(gomaxprocs)
|
|
memstats.gc_cpu_fraction = float64(work.totaltime) / float64(totalCpu)
|
|
|
|
// Reset sweep state.
|
|
sweep.nbgsweep = 0
|
|
sweep.npausesweep = 0
|
|
|
|
if work.userForced {
|
|
memstats.numforcedgc++
|
|
}
|
|
|
|
// Bump GC cycle count and wake goroutines waiting on sweep.
|
|
lock(&work.sweepWaiters.lock)
|
|
memstats.numgc++
|
|
injectglist(&work.sweepWaiters.list)
|
|
unlock(&work.sweepWaiters.lock)
|
|
|
|
// Finish the current heap profiling cycle and start a new
|
|
// heap profiling cycle. We do this before starting the world
|
|
// so events don't leak into the wrong cycle.
|
|
mProf_NextCycle()
|
|
|
|
systemstack(func() { startTheWorldWithSema(true) })
|
|
|
|
// Flush the heap profile so we can start a new cycle next GC.
|
|
// This is relatively expensive, so we don't do it with the
|
|
// world stopped.
|
|
mProf_Flush()
|
|
|
|
// Prepare workbufs for freeing by the sweeper. We do this
|
|
// asynchronously because it can take non-trivial time.
|
|
prepareFreeWorkbufs()
|
|
|
|
// Ensure all mcaches are flushed. Each P will flush its own
|
|
// mcache before allocating, but idle Ps may not. Since this
|
|
// is necessary to sweep all spans, we need to ensure all
|
|
// mcaches are flushed before we start the next GC cycle.
|
|
systemstack(func() {
|
|
forEachP(func(_p_ *p) {
|
|
_p_.mcache.prepareForSweep()
|
|
})
|
|
})
|
|
|
|
// Print gctrace before dropping worldsema. As soon as we drop
|
|
// worldsema another cycle could start and smash the stats
|
|
// we're trying to print.
|
|
if debug.gctrace > 0 {
|
|
util := int(memstats.gc_cpu_fraction * 100)
|
|
|
|
var sbuf [24]byte
|
|
printlock()
|
|
print("gc ", memstats.numgc,
|
|
" @", string(itoaDiv(sbuf[:], uint64(work.tSweepTerm-runtimeInitTime)/1e6, 3)), "s ",
|
|
util, "%: ")
|
|
prev := work.tSweepTerm
|
|
for i, ns := range []int64{work.tMark, work.tMarkTerm, work.tEnd} {
|
|
if i != 0 {
|
|
print("+")
|
|
}
|
|
print(string(fmtNSAsMS(sbuf[:], uint64(ns-prev))))
|
|
prev = ns
|
|
}
|
|
print(" ms clock, ")
|
|
for i, ns := range []int64{sweepTermCpu, gcController.assistTime, gcController.dedicatedMarkTime + gcController.fractionalMarkTime, gcController.idleMarkTime, markTermCpu} {
|
|
if i == 2 || i == 3 {
|
|
// Separate mark time components with /.
|
|
print("/")
|
|
} else if i != 0 {
|
|
print("+")
|
|
}
|
|
print(string(fmtNSAsMS(sbuf[:], uint64(ns))))
|
|
}
|
|
print(" ms cpu, ",
|
|
work.heap0>>20, "->", work.heap1>>20, "->", work.heap2>>20, " MB, ",
|
|
work.heapGoal>>20, " MB goal, ",
|
|
work.maxprocs, " P")
|
|
if work.userForced {
|
|
print(" (forced)")
|
|
}
|
|
print("\n")
|
|
printunlock()
|
|
}
|
|
|
|
semrelease(&worldsema)
|
|
// Careful: another GC cycle may start now.
|
|
|
|
releasem(mp)
|
|
mp = nil
|
|
|
|
// now that gc is done, kick off finalizer thread if needed
|
|
if !concurrentSweep {
|
|
// give the queued finalizers, if any, a chance to run
|
|
Gosched()
|
|
}
|
|
}
|
|
|
|
// gcBgMarkStartWorkers prepares background mark worker goroutines.
|
|
// These goroutines will not run until the mark phase, but they must
|
|
// be started while the work is not stopped and from a regular G
|
|
// stack. The caller must hold worldsema.
|
|
func gcBgMarkStartWorkers() {
|
|
// Background marking is performed by per-P G's. Ensure that
|
|
// each P has a background GC G.
|
|
for _, p := range allp {
|
|
if p.gcBgMarkWorker == 0 {
|
|
expectSystemGoroutine()
|
|
go gcBgMarkWorker(p)
|
|
notetsleepg(&work.bgMarkReady, -1)
|
|
noteclear(&work.bgMarkReady)
|
|
}
|
|
}
|
|
}
|
|
|
|
// gcBgMarkPrepare sets up state for background marking.
|
|
// Mutator assists must not yet be enabled.
|
|
func gcBgMarkPrepare() {
|
|
// Background marking will stop when the work queues are empty
|
|
// and there are no more workers (note that, since this is
|
|
// concurrent, this may be a transient state, but mark
|
|
// termination will clean it up). Between background workers
|
|
// and assists, we don't really know how many workers there
|
|
// will be, so we pretend to have an arbitrarily large number
|
|
// of workers, almost all of which are "waiting". While a
|
|
// worker is working it decrements nwait. If nproc == nwait,
|
|
// there are no workers.
|
|
work.nproc = ^uint32(0)
|
|
work.nwait = ^uint32(0)
|
|
}
|
|
|
|
func gcBgMarkWorker(_p_ *p) {
|
|
setSystemGoroutine()
|
|
|
|
gp := getg()
|
|
|
|
type parkInfo struct {
|
|
m muintptr // Release this m on park.
|
|
attach puintptr // If non-nil, attach to this p on park.
|
|
}
|
|
// We pass park to a gopark unlock function, so it can't be on
|
|
// the stack (see gopark). Prevent deadlock from recursively
|
|
// starting GC by disabling preemption.
|
|
gp.m.preemptoff = "GC worker init"
|
|
park := new(parkInfo)
|
|
gp.m.preemptoff = ""
|
|
|
|
park.m.set(acquirem())
|
|
park.attach.set(_p_)
|
|
// Inform gcBgMarkStartWorkers that this worker is ready.
|
|
// After this point, the background mark worker is scheduled
|
|
// cooperatively by gcController.findRunnable. Hence, it must
|
|
// never be preempted, as this would put it into _Grunnable
|
|
// and put it on a run queue. Instead, when the preempt flag
|
|
// is set, this puts itself into _Gwaiting to be woken up by
|
|
// gcController.findRunnable at the appropriate time.
|
|
notewakeup(&work.bgMarkReady)
|
|
|
|
for {
|
|
// Go to sleep until woken by gcController.findRunnable.
|
|
// We can't releasem yet since even the call to gopark
|
|
// may be preempted.
|
|
gopark(func(g *g, parkp unsafe.Pointer) bool {
|
|
park := (*parkInfo)(parkp)
|
|
|
|
// The worker G is no longer running, so it's
|
|
// now safe to allow preemption.
|
|
releasem(park.m.ptr())
|
|
|
|
// If the worker isn't attached to its P,
|
|
// attach now. During initialization and after
|
|
// a phase change, the worker may have been
|
|
// running on a different P. As soon as we
|
|
// attach, the owner P may schedule the
|
|
// worker, so this must be done after the G is
|
|
// stopped.
|
|
if park.attach != 0 {
|
|
p := park.attach.ptr()
|
|
park.attach.set(nil)
|
|
// cas the worker because we may be
|
|
// racing with a new worker starting
|
|
// on this P.
|
|
if !p.gcBgMarkWorker.cas(0, guintptr(unsafe.Pointer(g))) {
|
|
// The P got a new worker.
|
|
// Exit this worker.
|
|
return false
|
|
}
|
|
}
|
|
return true
|
|
}, unsafe.Pointer(park), waitReasonGCWorkerIdle, traceEvGoBlock, 0)
|
|
|
|
// Loop until the P dies and disassociates this
|
|
// worker (the P may later be reused, in which case
|
|
// it will get a new worker) or we failed to associate.
|
|
if _p_.gcBgMarkWorker.ptr() != gp {
|
|
break
|
|
}
|
|
|
|
// Disable preemption so we can use the gcw. If the
|
|
// scheduler wants to preempt us, we'll stop draining,
|
|
// dispose the gcw, and then preempt.
|
|
park.m.set(acquirem())
|
|
|
|
if gcBlackenEnabled == 0 {
|
|
throw("gcBgMarkWorker: blackening not enabled")
|
|
}
|
|
|
|
startTime := nanotime()
|
|
_p_.gcMarkWorkerStartTime = startTime
|
|
|
|
decnwait := atomic.Xadd(&work.nwait, -1)
|
|
if decnwait == work.nproc {
|
|
println("runtime: work.nwait=", decnwait, "work.nproc=", work.nproc)
|
|
throw("work.nwait was > work.nproc")
|
|
}
|
|
|
|
systemstack(func() {
|
|
// Mark our goroutine preemptible so its stack
|
|
// can be scanned. This lets two mark workers
|
|
// scan each other (otherwise, they would
|
|
// deadlock). We must not modify anything on
|
|
// the G stack. However, stack shrinking is
|
|
// disabled for mark workers, so it is safe to
|
|
// read from the G stack.
|
|
casgstatus(gp, _Grunning, _Gwaiting)
|
|
switch _p_.gcMarkWorkerMode {
|
|
default:
|
|
throw("gcBgMarkWorker: unexpected gcMarkWorkerMode")
|
|
case gcMarkWorkerDedicatedMode:
|
|
gcDrain(&_p_.gcw, gcDrainUntilPreempt|gcDrainFlushBgCredit)
|
|
if gp.preempt {
|
|
// We were preempted. This is
|
|
// a useful signal to kick
|
|
// everything out of the run
|
|
// queue so it can run
|
|
// somewhere else.
|
|
lock(&sched.lock)
|
|
for {
|
|
gp, _ := runqget(_p_)
|
|
if gp == nil {
|
|
break
|
|
}
|
|
globrunqput(gp)
|
|
}
|
|
unlock(&sched.lock)
|
|
}
|
|
// Go back to draining, this time
|
|
// without preemption.
|
|
gcDrain(&_p_.gcw, gcDrainFlushBgCredit)
|
|
case gcMarkWorkerFractionalMode:
|
|
gcDrain(&_p_.gcw, gcDrainFractional|gcDrainUntilPreempt|gcDrainFlushBgCredit)
|
|
case gcMarkWorkerIdleMode:
|
|
gcDrain(&_p_.gcw, gcDrainIdle|gcDrainUntilPreempt|gcDrainFlushBgCredit)
|
|
}
|
|
casgstatus(gp, _Gwaiting, _Grunning)
|
|
})
|
|
|
|
// Account for time.
|
|
duration := nanotime() - startTime
|
|
switch _p_.gcMarkWorkerMode {
|
|
case gcMarkWorkerDedicatedMode:
|
|
atomic.Xaddint64(&gcController.dedicatedMarkTime, duration)
|
|
atomic.Xaddint64(&gcController.dedicatedMarkWorkersNeeded, 1)
|
|
case gcMarkWorkerFractionalMode:
|
|
atomic.Xaddint64(&gcController.fractionalMarkTime, duration)
|
|
atomic.Xaddint64(&_p_.gcFractionalMarkTime, duration)
|
|
case gcMarkWorkerIdleMode:
|
|
atomic.Xaddint64(&gcController.idleMarkTime, duration)
|
|
}
|
|
|
|
// Was this the last worker and did we run out
|
|
// of work?
|
|
incnwait := atomic.Xadd(&work.nwait, +1)
|
|
if incnwait > work.nproc {
|
|
println("runtime: p.gcMarkWorkerMode=", _p_.gcMarkWorkerMode,
|
|
"work.nwait=", incnwait, "work.nproc=", work.nproc)
|
|
throw("work.nwait > work.nproc")
|
|
}
|
|
|
|
// If this worker reached a background mark completion
|
|
// point, signal the main GC goroutine.
|
|
if incnwait == work.nproc && !gcMarkWorkAvailable(nil) {
|
|
// Make this G preemptible and disassociate it
|
|
// as the worker for this P so
|
|
// findRunnableGCWorker doesn't try to
|
|
// schedule it.
|
|
_p_.gcBgMarkWorker.set(nil)
|
|
releasem(park.m.ptr())
|
|
|
|
gcMarkDone()
|
|
|
|
// Disable preemption and prepare to reattach
|
|
// to the P.
|
|
//
|
|
// We may be running on a different P at this
|
|
// point, so we can't reattach until this G is
|
|
// parked.
|
|
park.m.set(acquirem())
|
|
park.attach.set(_p_)
|
|
}
|
|
}
|
|
}
|
|
|
|
// gcMarkWorkAvailable reports whether executing a mark worker
|
|
// on p is potentially useful. p may be nil, in which case it only
|
|
// checks the global sources of work.
|
|
func gcMarkWorkAvailable(p *p) bool {
|
|
if p != nil && !p.gcw.empty() {
|
|
return true
|
|
}
|
|
if !work.full.empty() {
|
|
return true // global work available
|
|
}
|
|
if work.markrootNext < work.markrootJobs {
|
|
return true // root scan work available
|
|
}
|
|
return false
|
|
}
|
|
|
|
// gcMark runs the mark (or, for concurrent GC, mark termination)
|
|
// All gcWork caches must be empty.
|
|
// STW is in effect at this point.
|
|
func gcMark(start_time int64) {
|
|
if debug.allocfreetrace > 0 {
|
|
tracegc()
|
|
}
|
|
|
|
if gcphase != _GCmarktermination {
|
|
throw("in gcMark expecting to see gcphase as _GCmarktermination")
|
|
}
|
|
work.tstart = start_time
|
|
|
|
// Check that there's no marking work remaining.
|
|
if work.full != 0 || work.markrootNext < work.markrootJobs {
|
|
print("runtime: full=", hex(work.full), " next=", work.markrootNext, " jobs=", work.markrootJobs, " nDataRoots=", work.nDataRoots, " nSpanRoots=", work.nSpanRoots, " nStackRoots=", work.nStackRoots, "\n")
|
|
panic("non-empty mark queue after concurrent mark")
|
|
}
|
|
|
|
if debug.gccheckmark > 0 {
|
|
// This is expensive when there's a large number of
|
|
// Gs, so only do it if checkmark is also enabled.
|
|
gcMarkRootCheck()
|
|
}
|
|
if work.full != 0 {
|
|
throw("work.full != 0")
|
|
}
|
|
|
|
// Clear out buffers and double-check that all gcWork caches
|
|
// are empty. This should be ensured by gcMarkDone before we
|
|
// enter mark termination.
|
|
//
|
|
// TODO: We could clear out buffers just before mark if this
|
|
// has a non-negligible impact on STW time.
|
|
for _, p := range allp {
|
|
// The write barrier may have buffered pointers since
|
|
// the gcMarkDone barrier. However, since the barrier
|
|
// ensured all reachable objects were marked, all of
|
|
// these must be pointers to black objects. Hence we
|
|
// can just discard the write barrier buffer.
|
|
if debug.gccheckmark > 0 || throwOnGCWork {
|
|
// For debugging, flush the buffer and make
|
|
// sure it really was all marked.
|
|
wbBufFlush1(p)
|
|
} else {
|
|
p.wbBuf.reset()
|
|
}
|
|
|
|
gcw := &p.gcw
|
|
if !gcw.empty() {
|
|
printlock()
|
|
print("runtime: P ", p.id, " flushedWork ", gcw.flushedWork)
|
|
if gcw.wbuf1 == nil {
|
|
print(" wbuf1=<nil>")
|
|
} else {
|
|
print(" wbuf1.n=", gcw.wbuf1.nobj)
|
|
}
|
|
if gcw.wbuf2 == nil {
|
|
print(" wbuf2=<nil>")
|
|
} else {
|
|
print(" wbuf2.n=", gcw.wbuf2.nobj)
|
|
}
|
|
print("\n")
|
|
throw("P has cached GC work at end of mark termination")
|
|
}
|
|
// There may still be cached empty buffers, which we
|
|
// need to flush since we're going to free them. Also,
|
|
// there may be non-zero stats because we allocated
|
|
// black after the gcMarkDone barrier.
|
|
gcw.dispose()
|
|
}
|
|
|
|
throwOnGCWork = false
|
|
|
|
cachestats()
|
|
|
|
// Update the marked heap stat.
|
|
memstats.heap_marked = work.bytesMarked
|
|
|
|
// Update other GC heap size stats. This must happen after
|
|
// cachestats (which flushes local statistics to these) and
|
|
// flushallmcaches (which modifies heap_live).
|
|
memstats.heap_live = work.bytesMarked
|
|
memstats.heap_scan = uint64(gcController.scanWork)
|
|
|
|
if trace.enabled {
|
|
traceHeapAlloc()
|
|
}
|
|
}
|
|
|
|
// gcSweep must be called on the system stack because it acquires the heap
|
|
// lock. See mheap for details.
|
|
//go:systemstack
|
|
func gcSweep(mode gcMode) {
|
|
if gcphase != _GCoff {
|
|
throw("gcSweep being done but phase is not GCoff")
|
|
}
|
|
|
|
lock(&mheap_.lock)
|
|
mheap_.sweepgen += 2
|
|
mheap_.sweepdone = 0
|
|
if mheap_.sweepSpans[mheap_.sweepgen/2%2].index != 0 {
|
|
// We should have drained this list during the last
|
|
// sweep phase. We certainly need to start this phase
|
|
// with an empty swept list.
|
|
throw("non-empty swept list")
|
|
}
|
|
mheap_.pagesSwept = 0
|
|
mheap_.sweepArenas = mheap_.allArenas
|
|
mheap_.reclaimIndex = 0
|
|
mheap_.reclaimCredit = 0
|
|
unlock(&mheap_.lock)
|
|
|
|
if !_ConcurrentSweep || mode == gcForceBlockMode {
|
|
// Special case synchronous sweep.
|
|
// Record that no proportional sweeping has to happen.
|
|
lock(&mheap_.lock)
|
|
mheap_.sweepPagesPerByte = 0
|
|
unlock(&mheap_.lock)
|
|
// Sweep all spans eagerly.
|
|
for sweepone() != ^uintptr(0) {
|
|
sweep.npausesweep++
|
|
}
|
|
// Free workbufs eagerly.
|
|
prepareFreeWorkbufs()
|
|
for freeSomeWbufs(false) {
|
|
}
|
|
// All "free" events for this mark/sweep cycle have
|
|
// now happened, so we can make this profile cycle
|
|
// available immediately.
|
|
mProf_NextCycle()
|
|
mProf_Flush()
|
|
return
|
|
}
|
|
|
|
// Background sweep.
|
|
lock(&sweep.lock)
|
|
if sweep.parked {
|
|
sweep.parked = false
|
|
ready(sweep.g, 0, true)
|
|
}
|
|
unlock(&sweep.lock)
|
|
}
|
|
|
|
// gcResetMarkState resets global state prior to marking (concurrent
|
|
// or STW) and resets the stack scan state of all Gs.
|
|
//
|
|
// This is safe to do without the world stopped because any Gs created
|
|
// during or after this will start out in the reset state.
|
|
//
|
|
// gcResetMarkState must be called on the system stack because it acquires
|
|
// the heap lock. See mheap for details.
|
|
//
|
|
//go:systemstack
|
|
func gcResetMarkState() {
|
|
// This may be called during a concurrent phase, so make sure
|
|
// allgs doesn't change.
|
|
lock(&allglock)
|
|
for _, gp := range allgs {
|
|
gp.gcscandone = false // set to true in gcphasework
|
|
gp.gcscanvalid = false // stack has not been scanned
|
|
gp.gcAssistBytes = 0
|
|
}
|
|
unlock(&allglock)
|
|
|
|
// Clear page marks. This is just 1MB per 64GB of heap, so the
|
|
// time here is pretty trivial.
|
|
lock(&mheap_.lock)
|
|
arenas := mheap_.allArenas
|
|
unlock(&mheap_.lock)
|
|
for _, ai := range arenas {
|
|
ha := mheap_.arenas[ai.l1()][ai.l2()]
|
|
for i := range ha.pageMarks {
|
|
ha.pageMarks[i] = 0
|
|
}
|
|
}
|
|
|
|
work.bytesMarked = 0
|
|
work.initialHeapLive = atomic.Load64(&memstats.heap_live)
|
|
}
|
|
|
|
// Hooks for other packages
|
|
|
|
var poolcleanup func()
|
|
|
|
//go:linkname sync_runtime_registerPoolCleanup sync.runtime_registerPoolCleanup
|
|
func sync_runtime_registerPoolCleanup(f func()) {
|
|
poolcleanup = f
|
|
}
|
|
|
|
func clearpools() {
|
|
// clear sync.Pools
|
|
if poolcleanup != nil {
|
|
poolcleanup()
|
|
}
|
|
|
|
// Clear central sudog cache.
|
|
// Leave per-P caches alone, they have strictly bounded size.
|
|
// Disconnect cached list before dropping it on the floor,
|
|
// so that a dangling ref to one entry does not pin all of them.
|
|
lock(&sched.sudoglock)
|
|
var sg, sgnext *sudog
|
|
for sg = sched.sudogcache; sg != nil; sg = sgnext {
|
|
sgnext = sg.next
|
|
sg.next = nil
|
|
}
|
|
sched.sudogcache = nil
|
|
unlock(&sched.sudoglock)
|
|
|
|
// Clear central defer pools.
|
|
// Leave per-P pools alone, they have strictly bounded size.
|
|
lock(&sched.deferlock)
|
|
// disconnect cached list before dropping it on the floor,
|
|
// so that a dangling ref to one entry does not pin all of them.
|
|
var d, dlink *_defer
|
|
for d = sched.deferpool; d != nil; d = dlink {
|
|
dlink = d.link
|
|
d.link = nil
|
|
}
|
|
sched.deferpool = nil
|
|
unlock(&sched.deferlock)
|
|
}
|
|
|
|
// Timing
|
|
|
|
// itoaDiv formats val/(10**dec) into buf.
|
|
func itoaDiv(buf []byte, val uint64, dec int) []byte {
|
|
i := len(buf) - 1
|
|
idec := i - dec
|
|
for val >= 10 || i >= idec {
|
|
buf[i] = byte(val%10 + '0')
|
|
i--
|
|
if i == idec {
|
|
buf[i] = '.'
|
|
i--
|
|
}
|
|
val /= 10
|
|
}
|
|
buf[i] = byte(val + '0')
|
|
return buf[i:]
|
|
}
|
|
|
|
// fmtNSAsMS nicely formats ns nanoseconds as milliseconds.
|
|
func fmtNSAsMS(buf []byte, ns uint64) []byte {
|
|
if ns >= 10e6 {
|
|
// Format as whole milliseconds.
|
|
return itoaDiv(buf, ns/1e6, 0)
|
|
}
|
|
// Format two digits of precision, with at most three decimal places.
|
|
x := ns / 1e3
|
|
if x == 0 {
|
|
buf[0] = '0'
|
|
return buf[:1]
|
|
}
|
|
dec := 3
|
|
for x >= 100 {
|
|
x /= 10
|
|
dec--
|
|
}
|
|
return itoaDiv(buf, x, dec)
|
|
}
|