# HOWTO: FreeBSD CPU Scaling and Power Saving



## vermaden (Nov 16, 2008)

For those who do not know, FreeBSD is able to scale CPU speed on both desktop and mobile CPUs; they just need to support it and have it enabled in the BIOS.

To enable that feature, add this line to `/etc/rc.conf`:

```
powerd_enable="YES"
```

You can also tweak how much your CPU will scale depending on the load, for example:

```
powerd_flags="-i 85 -r 60 -p 100"
```
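As a rough illustration of what `-i` and `-r` control, here is a minimal sketch of the adaptive-mode decision for a single sample. It assumes the 7.x-era powerd(8) meaning, where both values are CPU idle percentages (newer releases document them as load percentages, so check the man page on your system). `adaptive_step` is a hypothetical helper, not part of powerd, and the real daemon does more than this. `-p` is the polling interval in milliseconds.

```shell
#!/bin/sh
# Hypothetical helper: what adaptive mode would do for one sample.
# $1 = measured CPU idle percentage
# $2 = -i threshold (idle above this: lower the frequency)
# $3 = -r threshold (idle below this: raise the frequency)
adaptive_step() {
	idle=$1; i_thresh=$2; r_thresh=$3
	if [ "$idle" -gt "$i_thresh" ]; then
		echo "lower"
	elif [ "$idle" -lt "$r_thresh" ]; then
		echo "raise"
	else
		echo "hold"
	fi
}
```

With `-i 85 -r 60`, `adaptive_step 95 85 60` prints `lower` and `adaptive_step 40 85 60` prints `raise`.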

*powerd* uses adaptive mode by default (thanks to *BSDKaffee*).

You can also set the lowest frequency the CPU will use by putting this in `/etc/sysctl.conf` or `/boot/loader.conf`:

```
debug.cpufreq.lowest=600
```

You can also set it by hand in a terminal using `sysctl`:

```
sysctl debug.cpufreq.lowest=600
```

Up to yesterday there was no option to set a highest value, to cap the maximum CPU speed in order to save power or limit overheating, but *Boris Kochergin* wrote a patch that adds an upper limit through the `debug.cpufreq.highest` oid:

```
sysctl debug.cpufreq.highest=1200
```

These patches are for `7.0-RELEASE` and `7-STABLE` (I did not check `8-CURRENT`, but they probably work there too):

`/usr/src/sys/kern/kern_cpu.c` (driver):

```
--- kern_cpu.c.orig	2008-11-08 13:12:24.000000000 -0500
+++ kern_cpu.c	2008-11-08 10:33:18.000000000 -0500
@@ -131,12 +131,16 @@
 DRIVER_MODULE(cpufreq, cpu, cpufreq_driver, cpufreq_dc, 0, 0);
 
 static int		cf_lowest_freq;
+static int		cf_highest_freq;
 static int		cf_verbose;
 TUNABLE_INT("debug.cpufreq.lowest", &cf_lowest_freq);
+TUNABLE_INT("debug.cpufreq.highest", &cf_highest_freq);
 TUNABLE_INT("debug.cpufreq.verbose", &cf_verbose);
 SYSCTL_NODE(_debug, OID_AUTO, cpufreq, CTLFLAG_RD, NULL, "cpufreq debugging");
 SYSCTL_INT(_debug_cpufreq, OID_AUTO, lowest, CTLFLAG_RW, &cf_lowest_freq, 1,
     "Don't provide levels below this frequency.");
+SYSCTL_INT(_debug_cpufreq, OID_AUTO, highest, CTLFLAG_RW, &cf_highest_freq, 1,
+    "Don't provide levels above this frequency.");
 SYSCTL_INT(_debug_cpufreq, OID_AUTO, verbose, CTLFLAG_RW, &cf_verbose, 1,
     "Print verbose debugging messages");
 
@@ -295,6 +299,14 @@
 		goto out;
 	}
 
+	/* Reject levels that are above our specified threshold. */
+	if (cf_highest_freq > 0 && level->total_set.freq > cf_highest_freq) {
+		CF_DEBUG("rejecting freq %d, greater than %d limit\n",
+		    level->total_set.freq, cf_highest_freq);
+		error = EINVAL;
+		goto out;
+	}
+
 	/* If already at this level, just return. */
 	if (CPUFREQ_CMP(sc->curr_level.total_set.freq, level->total_set.freq)) {
 		CF_DEBUG("skipping freq %d, same as current level %d\n",
@@ -617,8 +629,13 @@
 			continue;
 		}
 
-		/* Skip levels that have a frequency that is too low. */
-		if (lev->total_set.freq < cf_lowest_freq) {
+		/*
+		 * Skip levels that have a frequency that is too low or too
+		 * high.
+		 */
+		if (lev->total_set.freq < cf_lowest_freq ||
+		    (cf_highest_freq > 0 &&
+		     lev->total_set.freq > cf_highest_freq)) {
 			sc->all_count--;
 			continue;
 		}
```

`/usr/src/share/man/man4/cpufreq.4` (man page):

```
--- cpufreq.4.orig	2008-11-08 13:08:19.000000000 -0500
+++ cpufreq.4	2008-11-08 13:08:51.000000000 -0500
@@ -98,6 +98,11 @@
 This setting is also accessible via a tunable with the same name.
 This can be used to disable very low levels that may be unusable on
 some systems.
+.It Va debug.cpufreq.highest
+Highest CPU frequency in MHz to offer to users.
+This setting is also accessible via a tunable with the same name.
+This can be used to disable very high levels that may be unusable on
+some systems.
 .It Va debug.cpufreq.verbose
 Print verbose messages.
 This setting is also accessible via a tunable with the same name.
```

Apply them like this:

```
# cd /usr/src/share/man/man4
# patch < /path/to/cpufreq.4.patch
# 
# cd /usr/src/sys/kern
# patch < /path/to/kern_cpu.c.patch
```

Then rebuild the kernel and reboot to use it.

The `/usr/src/share/man/man4/cpufreq.4` patch only touches the man page, so applying it is not mandatory.
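The level filtering that the driver patch adds can be sketched in shell. This is only an illustration of the logic in the diff (levels below `lowest` are skipped, and levels above `highest` are skipped only when `highest` is greater than 0); `filter_levels` is a hypothetical helper, not something the kernel or base system provides.

```shell
#!/bin/sh
# Hypothetical helper mirroring the patched level filter.
# $1 = space-separated frequencies in MHz
# $2 = lowest allowed frequency
# $3 = highest allowed frequency (0 means "no upper limit")
filter_levels() {
	lowest=$2; highest=$3; out=""
	for freq in $1; do
		# Skip levels that are too low.
		[ "$freq" -lt "$lowest" ] && continue
		# Skip levels that are too high, but only if a limit is set.
		if [ "$highest" -gt 0 ] && [ "$freq" -gt "$highest" ]; then
			continue
		fi
		out="${out:+$out }$freq"
	done
	echo "$out"
}
```

For example, `filter_levels "2000 1750 1400 1200 1050 900 750 600 450" 600 1200` prints `1200 1050 900 750 600`.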

Available CPU frequencies are listed by the `dev.cpu.0.freq_levels` oid, for example:

```
# sysctl dev.cpu.0.freq_levels 
dev.cpu.0.freq_levels: 1200/13000 1050/11375 900/9750 750/8125 600/6500
```
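Each entry in that output is a `frequency/power` pair; according to cpufreq(4) the units are MHz and milliwatts, and a power of `-1` means the driver does not know the value. A small sketch for splitting the pairs, where `list_levels` is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: print one "MHz, mW" line per level.
# $1 = value as printed by `sysctl -n dev.cpu.0.freq_levels`
list_levels() {
	for level in $1; do
		freq=${level%%/*}    # part before the slash: frequency in MHz
		power=${level##*/}   # part after the slash: power in mW (-1 if unknown)
		echo "$freq MHz, $power mW"
	done
}
```

On a FreeBSD box it could be called as `list_levels "$(sysctl -n dev.cpu.0.freq_levels)"`.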

You can also set the lowest Cx sleep state for each CPU with `dev.cpu.0.cx_lowest`, `dev.cpu.1.cx_lowest` and so on, one oid per CPU.

You can change them like this:

```
# sysctl dev.cpu.0.cx_lowest=C3
dev.cpu.0.cx_lowest: C1 -> C3
```

*WARN:* I do not know about other laptops, but when I use C3 (or deeper, like C4, C5, ...) as the lowest state for all cores, my touchpad lags a little. This is easily eliminated by setting one of the CPUs to C2 and all the others to C3 to save power; there is no lag with those settings.

The list of supported states is available via these oids:

```
dev.cpu.0.cx_supported: C1/1 C2/1 C3/57
dev.cpu.1.cx_supported: C1/1 C2/1 C3/57
```
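Each entry is a `state/latency` pair, with the latency documented in acpi(4) as microseconds. If you want to pick the deepest state under some latency limit, a sketch could look like this (`deepest_cx` is a hypothetical helper, and the limit you pass is your own choice):

```shell
#!/bin/sh
# Hypothetical helper: print the deepest C-state whose latency does not
# exceed a limit.
# $1 = value of dev.cpu.N.cx_supported, e.g. "C1/1 C2/1 C3/57"
# $2 = maximum acceptable latency (same unit as the sysctl reports)
deepest_cx() {
	best=""
	for entry in $1; do
		state=${entry%%/*}
		latency=${entry##*/}
		# Entries go from shallowest to deepest, so the last match wins.
		if [ "$latency" -le "$2" ]; then
			best=$state
		fi
	done
	echo "$best"
}
```

With the values above, `deepest_cx "C1/1 C2/1 C3/57" 10` prints `C2`.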

Suggested setting (one core at C2, the others as deep as possible) in `/etc/sysctl.conf`:

```
dev.cpu.0.cx_lowest=C3
dev.cpu.1.cx_lowest=C2
```
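For machines with more cores, the same pattern can be generated instead of typed. This is a sketch only: `cx_conf` is a hypothetical helper, and keeping the last core shallow simply mirrors the two-core example above.

```shell
#!/bin/sh
# Hypothetical helper: print sysctl.conf lines that keep one core at a
# shallow state and let the rest go deeper.
# $1 = number of CPUs, $2 = shallow state, $3 = deep state
cx_conf() {
	ncpu=$1; cpu=0
	while [ "$cpu" -lt "$ncpu" ]; do
		if [ "$cpu" -eq $((ncpu - 1)) ]; then
			echo "dev.cpu.$cpu.cx_lowest=$2"   # last core stays shallow
		else
			echo "dev.cpu.$cpu.cx_lowest=$3"
		fi
		cpu=$((cpu + 1))
	done
}
```

`cx_conf "$(sysctl -n hw.ncpu)" C2 C3` prints the lines for the current machine; review them before appending them to `/etc/sysctl.conf`.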

You can read more about Intel C power states here:
http://software.intel.com/en-us/blogs/2008/03/27/update-c-states-c-states-and-even-more-c-states/
http://www.techarp.com/showarticle.aspx?artno=420&pgno=6

I measured power consumption with my CPU, an Intel T7300 (in my Dell D630), under full load [1], using a small device called a wattmeter. It is connected like this:


```
power (in the wall) <--> wattmeter <--> laptop (without batteries)
```

Here are the results:

```
MHz    system power consumption (whole laptop)
 150    22W
 300    22W
 450    23W
 600    23W
 750    24W
 900    25W
1050    26W
1200    27W
1400    33W
1750    42W
2000    47W
```

1200 MHz seems to have the best power/performance ratio, and that is what I personally use.
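One way to check that reading of the table is to compute MHz per watt for each row. The sketch below uses the measured whole-laptop figures, so the ratio includes screen, disk and chipset power, not just the CPU; `best_ratio` is a hypothetical helper.

```shell
#!/bin/sh
# Measurements from the table above: "MHz watts" (whole laptop, full load).
measurements="150 22
300 22
450 23
600 23
750 24
900 25
1050 26
1200 27
1400 33
1750 42
2000 47"

# Hypothetical helper: read "MHz watts" pairs on stdin and print the pair
# with the highest MHz-per-watt ratio.
best_ratio() {
	awk '$2 > 0 {
		ratio = $1 / $2
		if (ratio > best) { best = ratio; mhz = $1; watts = $2 }
	}
	END { printf "%d MHz at %d W (%.1f MHz/W)\n", mhz, watts, best }'
}
```

Running `printf '%s\n' "$measurements" | best_ratio` prints `1200 MHz at 27 W (44.4 MHz/W)`.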

[1] `999999999999999999999999999 ** 999999999999999999999999999;` launched 4 times (to fully load two cores) in `python`.

... and by the way, setting `kern.hz=100` in `/boot/loader.conf` will also make your battery last a little longer.

*WARN: If these options differ for AMD CPUs, then let me know, or just post them in this thread.*

If you have any questions, or if I forgot something, let me know.


----------



## manolis@ (Nov 16, 2008)

Thanks, this is really useful 
You may wish to modify powerd_flags to this:


```
powerd_flags="-a maximum -b adaptive -i 85 -r 60 -p 100"
```

if you are using it on a laptop (esp. a low end one, like my aspire one) so you will get maximum performance when plugged in.

I am still playing with these values myself, to get the best possible responsiveness and save on battery too. I am not there yet, but I guess I'll tweak the minimum CPU freq. to 700Mhz and will be close.


----------



## vermaden (Nov 16, 2008)

manolis@ said:
			
		

> Thanks, this is really useful



You are welcome 



			
manolis@ said:

> You may wish to modify powerd_flags to this (...)
> if you are using it on a laptop (esp. a low end one, like my aspire one) so you will get maximum performance when plugged in.



Good point, I do not have any experience with small netbooks.

All these calculations are made on Dell D630 laptop (I mentioned T7300 CPU).


----------



## overmind (Nov 20, 2008)

Hello, Manolis@,

I see you are using an Acer Aspire One. I just bought one recently and I am trying to configure it. Do you have any hints?

I've opened a thread regarding the Acer Aspire One here: http://forums.freebsd.org/showthread.php?t=382

If you have some tips, please share them with us.
(network card, wifi, card reader, webcam, power management, optimization tips)


----------



## xwwu (Nov 30, 2008)

Dear Vermaden:

First question: 

Is


> powerd_flags="-i 85 -r 60 -p 100"


in /etc/rc.conf also?

Second:

How can I set:



> # sysctl dev.cpu.0.cx_lowest=C3
> dev.cpu.1.cx_lowest: C1 -> C3



permanently in the system instead of typing them in a terminal?

Thanks!


----------



## danger@ (Nov 30, 2008)

xwwu said:
			
		

> permanently in system instead of type them in terminal?



Add them to /etc/sysctl.conf.


----------



## richardpl (Dec 1, 2008)

The right way to set CPU Cx states (C1, C2, C3, C4, ...) is via rc.conf:


```
performance_cx_lowest="HIGH"    # Online CPU idle state
performance_cpu_freq="NONE"     # Online CPU frequency
economy_cx_lowest="HIGH"        # Offline CPU idle state
economy_cpu_freq="NONE"         # Offline CPU frequency
```

In this way they are used with devd(8).
Read /etc/rc.d/power_profile for explanation.
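As a rough guide to those values, here is a simplified sketch of how the special words are resolved. It assumes `HIGH` and `LOW` are replaced by the highest and lowest supported value, `NONE` leaves the sysctl alone, and anything else (such as `C2`) is passed through unchanged. `resolve_profile_value` is a hypothetical name; check `/etc/rc.d/power_profile` on your release for the real logic.

```shell
#!/bin/sh
# Simplified sketch of resolving a power profile value.
# $1 = configured value, $2 = value meaning "HIGH", $3 = value meaning "LOW"
resolve_profile_value() {
	case "$1" in
	[Hh][Ii][Gg][Hh]) echo "$2" ;;
	[Ll][Oo][Ww])     echo "$3" ;;
	[Nn][Oo][Nn][Ee]) echo "" ;;      # leave the sysctl untouched
	*)                echo "$1" ;;    # explicit value such as C2 or 1200
	esac
}
```

Under those assumptions, `economy_cx_lowest="LOW"` on a machine that supports C1 to C3 would resolve to `C3`.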


----------



## vermaden (Dec 1, 2008)

richardpl said:
			
		

> Right way to set CPU Cx states: C{1,2,3,4 ..} is via rc.conf:
> ...



But does it allow setting different C states per CPU core?


----------



## richardpl (Dec 1, 2008)

No, but that one is not hard to fix.
The problem with setting it via sysctl.conf is that some ACPI implementations allow C3 and lower states only when the laptop is not on AC.
So the CPU will be put into a lower power state only once the laptop is disconnected from AC. Also, it is not useful to have the same sysctl settings when the laptop is on AC and when it is on battery.
And it is very ugly to modify Cx states manually.


----------



## vermaden (Dec 1, 2008)

richardpl said:
			
		

> No, but that one is not hard to fix.



So it is a little useless, because setting both cores to C3 creates a big delay in the touchpad's reaction, while setting one core to C2 and the other one to C3 solves that problem.



			
richardpl said:

> Problem with setting it via sysctl.conf is that some ACPI allow C3 and lower states only when laptop is not on AC.

So what will happen then? Will it be put into a higher C state like C0, and then go back to C3, for example, when you remove the power cord?

> ...

So develop a better interface. Sysctls are designed to be used, not to be hidden from use. Also, it is done ONCE; later it is just loaded at boot.

I would also like FreeBSD to detect the best possible settings for my current laptop model by itself, but we both know that is impossible, so unfortunately we have to set them manually.


----------



## richardpl (Dec 1, 2008)

vermaden said:
			
		

> So its little useless cause setting both cores to C3 creates a big delay in touchpad getting to react, while setting one core to C2 and the other one to C3 solves that problem.


It is not useless for machines with only one cpu/core (enabled).



			
vermaden said:

> So what will happen then? It will be put into higher C state like C0 and when you remove power cord it will back to C3 for example?


Unfortunately not, sysctl reports an invalid argument and quits. But it is the BIOS's "fault" for not allowing C3 and lower while on AC.



			
power_profile said:

> #!/bin/sh
> #
> # Modify the power profile based on AC line state.  This script is
> # usually called from devd(8).



devd.conf(5) is the "right" API for that, not manual typing and/or sysctl.conf (which is almost always read only once).
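To combine this with the per-core setup discussed earlier in the thread, a devd action script could choose the Cx limits from the AC line state. The sketch below is hypothetical: the function name is made up, the two-core layout is an example, and it assumes the ACAD notify value is `0x01` on AC and `0x00` on battery. It only prints the commands, so they can be reviewed before being piped to `sh`.

```shell
#!/bin/sh
# Hypothetical devd action: print the sysctl commands for an AC line event.
# $1 = ACPI ACAD notify value (assumed: 0x01 = on AC, 0x00 = on battery)
cx_for_ac_state() {
	case "$1" in
	0x01)	# on AC: stay responsive, avoid states the BIOS may refuse
		echo "sysctl dev.cpu.0.cx_lowest=C2"
		echo "sysctl dev.cpu.1.cx_lowest=C2"
		;;
	0x00)	# on battery: one core shallow, the other deep
		echo "sysctl dev.cpu.0.cx_lowest=C3"
		echo "sysctl dev.cpu.1.cx_lowest=C2"
		;;
	*)
		echo "unknown AC state: $1" >&2
		return 1
		;;
	esac
}
```

A devd.conf(5) `notify` block matching system `ACPI` and subsystem `ACAD` could call a script built around this with `$notify` as its argument, the same way the stock entry calls `/etc/rc.d/power_profile`.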


----------



## trev (Jan 5, 2009)

*cpufreq for Phenoms and Opterons (AMD Family 10h and 11h)*

It seems a little unbelievable that the AMD Phenom is not yet officially supported by cpufreq on FreeBSD 7.X-RELEASE, but help is at hand. 

[Note: read Edits at end of post for updates]

How to install and more:
  0. shell> /etc/rc.d/powerd stop
  1. detach the hwpstate.c file attached to this post
  2. shell> cp hwpstate.c /usr/src/sys/i386/cpufreq/
  3. edit /usr/src/sys/modules/cpufreq/Makefile and change,
     -SRCS+= est.c p4tcc.c powernow.c
     +SRCS+= est.c p4tcc.c powernow.c hwpstate.c
  4. delete the line "device cpufreq" from your KERNCONF file if 
     present and make kernel without cpufreq.
  5. shell> cd /usr/src/sys/modules/cpufreq/ && make && make install
  6. "umount -a" or "mount -u -o ro /somewhere" in case of a kernel panic, then sync;sync;sync (if you're paranoid)
  7. shell> kldload cpufreq
  8. dmesg should show the verbose message "hwpstate0: <Cool`n'Quiet 2.0> on cpu0".
  9. shell> sysctl dev.cpu.0.freq_levels
 10. shell> sysctl dev.cpu.0.freq=XXXX
 11. shell> /etc/rc.d/powerd start

(Courtesy of G. Otsuji anonna2 at gmail dot com)

And a script I wrote to reduce typing 

#!/bin/sh
speed=`sysctl dev.cpu.0.freq | cut -f2 -d":"`
possibles=`sysctl dev.cpu.0.freq_levels | cut -f2 -d":" | sed "s/\/[-0-9]*/MHz/g"`

echo ""
echo "Speed: ${speed}MHz from${possibles}"
echo ""

shell>speed

Speed:  1100MHz from 2200MHz 1100MHz 

[above display for my Phenom 9550]

=================================================================

Edit 2009/02/21: Patch supplied by author against the PR version located at http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/128575

Edit 2009/05/30: See new "closed" PR version located at http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/128575 - Only incorporated in CURRENT (8.0) but also works for 7.X if you comment out hwpstate.c lines 307 and 308 being:


```
if (cpu_vendor_id != CPU_VENDOR_AMD || CPU_FAMILY(cpu_id) < 0x10)
                return;
```

as CPU_VENDOR_AMD is not defined in 7.X. This is safe enough because you _do_ have an AMD 10h (Phenom/Opteron Quad) or 11h (Phenom II?) family CPU 

I have removed the original attachments/code as a result.

Edit 2010/01/22: The hwpstate.c file referenced above now causes cpufreq not to load on FreeBSD 7.2-STABLE. However, if you grab the latest hwpstate.c file from http://www.freebsd.org/cgi/cvsweb.c...wpstate.c?rev=1.5.2.2;content-type=text/plain life returns to normal again.


----------



## vermaden (Jan 5, 2009)

Thanks for sharing it mate.

It is also inexplicable that 7.1-RELEASE does not support this out of the box; these CPUs have been around for more than a year ;/

BTW: You can simplify it this way:

```
-speed=`sysctl dev.cpu.0.freq | cut -f2 -d":"`
+speed=$( sysctl -n dev.cpu.0.freq )
```
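Put together, the simplified script could look like the sketch below. It keeps trev's `sed` expression and wraps it in functions so the formatting can be tried without the hardware; the function names are hypothetical, and the `-1` power values in the example are an assumption based on hwpstate.c reporting power as unknown.

```shell
#!/bin/sh
# Hypothetical helper: rewrite a freq_levels value such as "2200/-1 1100/-1"
# into "2200MHz 1100MHz" (same sed expression as in trev's script).
levels_to_mhz() {
	echo "$1" | sed 's/\/[-0-9]*/MHz/g'
}

# Print the current speed and the possible ones.
# $1 = current frequency in MHz, $2 = freq_levels value
show_speed() {
	echo "Speed: ${1}MHz from $(levels_to_mhz "$2")"
}
```

On a live system: `show_speed "$(sysctl -n dev.cpu.0.freq)" "$(sysctl -n dev.cpu.0.freq_levels)"`.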

also *hwpstate.c* for those who are NOT logged in:


```
/*-
 * Copyright (c) 2008 Gen Otsuji
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted providing that the following conditions
 * are met:
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 */

/*
 * very much thanks to Veronica(fluffles.net)
 */

/*
 * Reference:
 *  Rev 3.06  March 26, 2008 - BIOS and Kernel Developer's Guide(BKDG)
 *  for AMD Family 10h Processors
 */

#include <sys/cdefs.h>
__FBSDID("$FreeBSD$");

#include <sys/param.h>
#include <sys/bus.h>
#include <sys/cpu.h>
#include <sys/kernel.h>
#include <sys/module.h>
#include <sys/proc.h>
#include <dev/pci/pcivar.h>
#include <machine/md_var.h>

#include <contrib/dev/acpica/acpi.h>
#include <dev/acpica/acpivar.h>

#include "acpi_if.h"
#include "cpufreq_if.h"

#define MSR_AMD10H_LIMIT    0xc0010061
#define MSR_AMD10H_CONTROL  0xc0010062
#define MSR_AMD10H_STATUS   0xc0010063
#define MSR_AMD10H_CONFIG   0xc0010064
#define AMD10H_PVI_MODE     1
#define AMD10H_SVI_MODE     0
#define AMD10H_MAX_STATES   16

/* for MSR_AMD10H_LIMIT C001_0061 */
#define AMD10H_GET_PSTATE_MAX_VAL(msr)      (((msr) >> 4) & 0xF)
/* for MSR_AMD10H_CONFIG C001_0064:68 */
#define AMD10H_CUR_VID(msr)             (((msr) >> 9) & 0x7F)	/* VID is a 7-bit field */
#define AMD10H_CUR_DID(msr)             (((msr) >> 6) & 0x07)
#define AMD10H_CUR_FID(msr)             ((msr) & 0x3F)

/*
 * setting this to 0 can hush up verbose messages.
 */
static int hwpstate_verbose = 1;

struct hwpstate_setting {
	int freq;		/* CPU clock in Mhz or 100ths of a percent. */
	int volts;		/* Voltage in mV. */
	int power;		/* Power consumed in mW. */
	int lat;		/* Transition latency in us. */
	int pstate_id;
	device_t dev;		/* Driver providing this setting. */
};

struct hwpstate_softc {
	device_t dev;
	struct hwpstate_setting hwpstate_settings[AMD10H_MAX_STATES];
	int cfnum;
	int voltage_mode;	/* for AMD10H_PVI_MODE / AMD10H_SVI_MODE */
	int curpstate;
};

static void hwpstate_identify(driver_t * driver, device_t parent);
static int hwpstate_probe(device_t dev);
static int hwpstate_attach(device_t dev);
static int hwpstate_detach(device_t dev);
static int hwpstate_set(device_t dev, const struct cf_setting *cf);
static int hwpstate_get(device_t dev, struct cf_setting *cf);
static int hwpstate_settings(device_t dev, struct cf_setting *sets, int *count);
static int hwpstate_type(device_t dev, int *type);
static int hwpstate_shutdown(device_t dev);
static int hwpstate_features(driver_t * driver, u_int * features);

static device_method_t hwpstate_methods[] = {
	/* Device interface */
	DEVMETHOD(device_identify, hwpstate_identify),
	DEVMETHOD(device_probe, hwpstate_probe),
	DEVMETHOD(device_attach, hwpstate_attach),
	DEVMETHOD(device_detach, hwpstate_detach),
	DEVMETHOD(device_shutdown, hwpstate_shutdown),

	/* cpufreq interface */
	DEVMETHOD(cpufreq_drv_set, hwpstate_set),
	DEVMETHOD(cpufreq_drv_get, hwpstate_get),
	DEVMETHOD(cpufreq_drv_settings, hwpstate_settings),
	DEVMETHOD(cpufreq_drv_type, hwpstate_type),

	/* ACPI interface */
	DEVMETHOD(acpi_get_features, hwpstate_features),

	{0, 0}
};

static devclass_t hwpstate_devclass;
static driver_t hwpstate_driver = {
	"hwpstate",
	hwpstate_methods,
	sizeof(struct hwpstate_softc),
};
DRIVER_MODULE(hwpstate, cpu, hwpstate_driver, hwpstate_devclass, 0, 0);

static void
hwpstate_goto_pstate(device_t dev,int pstate)
{
	struct hwpstate_softc *sc;
	uint64_t msr;
	int i;
	sc = device_get_softc(dev);
	sc->curpstate = pstate;
	wrmsr(MSR_AMD10H_CONTROL, pstate);
	for(i=0;i<100;i++){
		msr=rdmsr(MSR_AMD10H_STATUS);
		if(msr==pstate){
			break;
		}
		DELAY(100);
	}
	msr=rdmsr(MSR_AMD10H_STATUS);
	if(hwpstate_verbose)
		device_printf(dev,"Now P%d-state.\n",(int)msr);
	return;
}

static int
hwpstate_set(device_t dev, const struct cf_setting *cf)
{
	struct hwpstate_softc *sc;
	struct hwpstate_setting *set;
	int i;
	if (cf == NULL)
		return (EINVAL);
	sc = device_get_softc(dev);
	set = sc->hwpstate_settings;
	for (i = 0; i < sc->cfnum; i++)
		if (cf->freq == set[i].freq)
			break;
	if (i == sc->cfnum)
		return EINVAL;
	if(hwpstate_verbose)
		device_printf(dev,"goto P%d-state\n",set[i].pstate_id);
	sc->curpstate = set[i].pstate_id;
	hwpstate_goto_pstate(dev,set[i].pstate_id);
	return (0);
}

static int
hwpstate_get(device_t dev, struct cf_setting *cf)
{
	struct hwpstate_softc *sc;
	struct hwpstate_setting set;
	sc = device_get_softc(dev);
	if (cf == NULL)
		return (EINVAL);
	set = sc->hwpstate_settings[sc->curpstate];
	cf->freq = set.freq;
	cf->volts = set.volts;
	cf->power = CPUFREQ_VAL_UNKNOWN;
	cf->lat = 16;
	cf->dev = dev;
	return (0);
}

static int
hwpstate_settings(device_t dev, struct cf_setting *sets, int *count)
{
	struct hwpstate_softc *sc;
	struct hwpstate_setting set;
	int i;
	if (sets == NULL || count == NULL)
		return (EINVAL);
	sc = device_get_softc(dev);
	if (*count < sc->cfnum)
		return (E2BIG);
	for (i = 0; i < sc->cfnum; i++, sets++) {
		set = sc->hwpstate_settings[i];
		sets->freq = set.freq;
		sets->volts = set.volts;
		sets->power = set.power;
		sets->lat = set.lat;
		sets->dev = set.dev;
	}
	*count = sc->cfnum;
	return (0);
}

static int
hwpstate_type(device_t dev, int *type)
{

	if (type == NULL)
		return (EINVAL);
	*type = CPUFREQ_TYPE_ABSOLUTE;
	return (0);
}

static int
hwpstate_is_capable(void)
{
	u_int regs[4];
	if (strcmp(cpu_vendor, "AuthenticAMD") != 0 ||
	    cpu_exthigh < 0x80000007)
		return (FALSE);
	do_cpuid(0x80000007, regs);
	if (regs[3] & 0x80) {	/* HwPstate Enable bit */
		return (TRUE);
	}
	return (FALSE);
}

static void
hwpstate_identify(driver_t * driver, device_t parent)
{
	device_t child;
	if (device_find_child(parent, "hwpstate", -1) != NULL) {
		return;
	}
	if ((child = BUS_ADD_CHILD(parent, 10, "hwpstate", -1)) == NULL)
		device_printf(parent, "hwpstate: add child failed\n");
}

static int
hwpstate_probe(device_t dev)
{
	struct hwpstate_softc *sc;
	device_t perf_dev;
	uint64_t msr;
	int error, type;
	if (resource_disabled("hwpstate", 0))
		return (ENXIO);

	/* this had not to be in hwpstate_identify() */
	if (hwpstate_is_capable() == FALSE) {
		return (ENXIO);
	}
	perf_dev = device_find_child(device_get_parent(dev), "acpi_perf", -1);
	if (perf_dev && device_is_attached(perf_dev)) {
		error = CPUFREQ_DRV_TYPE(perf_dev, &type);
		if (error == 0 && (type & CPUFREQ_FLAG_INFO_ONLY) == 0)
			return (ENXIO);
	}
	sc = device_get_softc(dev);
	switch (cpu_id) {
	case 0x100f2A:		/* family 10h rev.DR-BA */
	case 0x100f22:		/* family 10h rev.DR-B2 */
	case 0x100f23:		/* family 10h rev.DR-B3 */
		break;
	default:
		return (ENXIO);
	}
	msr = rdmsr(MSR_AMD10H_LIMIT);
	sc->cfnum = AMD10H_GET_PSTATE_MAX_VAL(msr);
	if (sc->cfnum == 0) {
		device_printf(dev, "hardware-pstate is not supported by the bios.\n");
		return ENXIO;
	}
	device_set_desc(dev, "Cool`n'Quiet 2.0");
	return (0);
}

static int
hwpstate_attach(device_t dev)
{
	struct hwpstate_softc *sc;
	struct hwpstate_setting *set;
	device_t F3;
	uint64_t msr;
	uint32_t cfg;
	int i, vid, did, fid;
	sc = device_get_softc(dev);

	/*
	 * following 24 means the 1st cpu. 25-31 instead of 24 is MP system.
	 * I don't have MP system. But only for reading from 1st cpu.
	 * so if the same 2*cpu, 4*cpu or 8*cpu, this can work, I think.
	 */
	F3 = pci_find_bsf(0, 24, 3);
	cfg = pci_read_config(F3, 0xA0, 4);
	if (cfg & 0x10) {	/* PVI mode */
		if (hwpstate_verbose)
			device_printf(dev, "PVI mode\n");
		sc->voltage_mode = AMD10H_PVI_MODE;
	} else {		/* SVI mode */
		if (hwpstate_verbose)
			device_printf(dev, "SVI mode\n");
		sc->voltage_mode = AMD10H_SVI_MODE;
	}
	msr = rdmsr(MSR_AMD10H_LIMIT);
	sc->cfnum = 1 + AMD10H_GET_PSTATE_MAX_VAL(msr);
	if (hwpstate_verbose)
		device_printf(dev, "you have %d P-state.\n", sc->cfnum);
	set = sc->hwpstate_settings;
	for (i = 0; i < sc->cfnum; i++, set++) {
		msr = rdmsr(MSR_AMD10H_CONFIG + i);
		if ((msr & 0x8000000000000000)) {
			vid = AMD10H_CUR_VID(msr);
			did = AMD10H_CUR_DID(msr);
			fid = AMD10H_CUR_FID(msr);
			set->freq = 100 * (fid + 0x10) / (1 << did);
			if (sc->voltage_mode == AMD10H_PVI_MODE) {
				/* 2.4.1.6.2 Parallel VID Encodings */
				if (vid >= 0x20)
					set->volts = (7625 - 125 * (vid - 0x20)) / 10;
				else
					set->volts = 1550 - 25 * vid;
			} else {
				/* 2.4.1.6.3 Serial VID Encodings */
				if (vid >= 0x7F)
					set->volts = 0;
				else
					set->volts = (15500 - 125 * vid) / 10;
			}
			if (hwpstate_verbose)
				device_printf(dev, "freq=%dMHz volts=%dmV\n", set->freq, set->volts);
			set->pstate_id = i;
			set->power = CPUFREQ_VAL_UNKNOWN;
			set->lat = 16;
			set->dev = dev;
		}
	}
	cpufreq_register(dev);
	hwpstate_goto_pstate(dev,0);
	return (0);
}

static int
hwpstate_detach(device_t dev)
{

	hwpstate_goto_pstate(dev,0);
	return (cpufreq_unregister(dev));
}

static int
hwpstate_shutdown(device_t dev)
{

	hwpstate_goto_pstate(dev,0);
	return (0);
}

static int
hwpstate_features(driver_t * driver, u_int * features)
{

	*features = ACPI_CAP_PERF_MSRS;
	return (0);
}
```
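The frequency and voltage arithmetic in `hwpstate_attach()` is easy to try outside the kernel. The sketch below repeats the same integer formulas in shell; the function names are hypothetical, and the FID/DID values in the usage note are made-up inputs chosen to give round numbers, not values read from a real MSR.

```shell
#!/bin/sh
# Core frequency in MHz from the FID and DID fields, as in hwpstate_attach():
#   freq = 100 * (fid + 0x10) / 2^did
# $1 = FID, $2 = DID (decimal integers)
pstate_freq() {
	echo $(( 100 * ($1 + 16) / (1 << $2) ))
}

# Voltage in mV for serial VID (SVI) mode, same arithmetic as the driver.
# $1 = VID (decimal integer)
svi_millivolts() {
	if [ "$1" -ge 127 ]; then
		echo 0
	else
		echo $(( (15500 - 125 * $1) / 10 ))
	fi
}
```

For instance, `pstate_freq 6 0` prints `2200` and `pstate_freq 6 1` prints `1100`, the two levels shown for the Phenom 9550 earlier in the thread.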


----------



## randux (Jan 13, 2009)

I'm having performance problems on my new 7.1 installs. How do I check what the values are? I want max. performance and I don't care about power consumption.


----------



## vermaden (Jan 13, 2009)

So disable *powerd* daemon.


----------



## randux (Jan 13, 2009)

Hi Vermaden,

Is it on by default? Is that all I have to do? I am running benchmarks now, I'll check soon. Anything else to check? The ubench numbers are great but it feels like a 486 box!


----------



## vermaden (Jan 13, 2009)

It's off by default.

The default 7.1 scheduler is tuned for at least 2 cores, so if you have an older single-core CPU, it may sometimes feel slowish in interactive tasks.


----------



## randux (Jan 13, 2009)

I think it's off by default. I will post my ubench and unixbench results in the performance thread I started.


----------



## randux (Jan 13, 2009)

We posted at the same time. No, it's a new E8400 Core 2 Duo box with 4 GB RAM and it feels terrible on FreeBSD. Everything else runs great. I want to know why...


----------



## vermaden (Jan 13, 2009)

randux said:
			
		

> We posted at the same time. No, it's a new E8400 core 2 duo box with 4g ram and it feels terrible on freebsd. Everything else runs great I want to know why...



These are my results:


```
$ time unixbench

(...)

  BYTE UNIX Benchmarks (Version 4.1.0)
  System -- mavio
  Start Benchmark Run: Tue Jan 13 18:29:05 CET 2009
   4 interactive users.
   6:29PM  up 8 days, 10:55, 4 users, load averages: 0.21, 0.29, 0.25
  -r-xr-xr-x  1 root  wheel  115292 Jan  1 12:49 /bin/sh
  /bin/sh: ELF 32-bit LSB executable, Intel 80386, version 1 (FreeBSD), for FreeBSD 7.1, dynamically linked (uses shared libs), FreeBSD-style, stripped
  /dev/ad5s1e     7870554 5411380 1829530    75%    /usr
Dhrystone 2 using register variables     7561720.6 lps   (10.0 secs, 10 samples)
Double-Precision Whetstone                 1418.4 MWIPS (10.0 secs, 10 samples)
System Call Overhead                     397634.2 lps   (10.0 secs, 10 samples)
Pipe Throughput                          557754.4 lps   (10.0 secs, 10 samples)
Pipe-based Context Switching             114951.4 lps   (10.0 secs, 10 samples)
Process Creation                           6087.5 lps   (30.0 secs, 3 samples)
Execl Throughput                           1794.2 lps   (29.8 secs, 3 samples)
File Read 1024 bufsize 2000 maxblocks    503761.0 KBps  (30.0 secs, 3 samples)
File Write 1024 bufsize 2000 maxblocks   113417.0 KBps  (30.0 secs, 3 samples)
File Copy 1024 bufsize 2000 maxblocks     70656.0 KBps  (30.0 secs, 3 samples)
File Read 256 bufsize 500 maxblocks      134718.0 KBps  (30.0 secs, 3 samples)
File Write 256 bufsize 500 maxblocks      77655.0 KBps  (30.0 secs, 3 samples)
File Copy 256 bufsize 500 maxblocks       47665.0 KBps  (30.0 secs, 3 samples)
File Read 4096 bufsize 8000 maxblocks    1421800.0 KBps  (30.0 secs, 3 samples)
File Write 4096 bufsize 8000 maxblocks    46577.0 KBps  (30.0 secs, 3 samples)
File Copy 4096 bufsize 8000 maxblocks     56051.0 KBps  (30.0 secs, 3 samples)
Shell Scripts (1 concurrent)               2824.3 lpm   (60.0 secs, 3 samples)
Shell Scripts (8 concurrent)                580.0 lpm   (60.0 secs, 3 samples)
Shell Scripts (16 concurrent)               296.7 lpm   (60.0 secs, 3 samples)
Arithmetic Test (type = short)           1441867.5 lps   (10.0 secs, 3 samples)
Arithmetic Test (type = int)             1397100.8 lps   (10.0 secs, 3 samples)
Arithmetic Test (type = long)            1403966.3 lps   (10.0 secs, 3 samples)
Arithmetic Test (type = float)           582325.3 lps   (10.0 secs, 3 samples)
Arithmetic Test (type = double)          579875.4 lps   (10.0 secs, 3 samples)
Arithoh                                       nan lps   (10.0 secs, 3 samples)
C Compiler Throughput                      1443.3 lpm   (60.0 secs, 3 samples)
Dc: sqrt(2) to 99 decimal places          89183.8 lpm   (30.0 secs, 3 samples)
Recursion Test--Tower of Hanoi            80150.9 lps   (20.0 secs, 3 samples)


                     INDEX VALUES            
TEST                                        BASELINE     RESULT      INDEX

Dhrystone 2 using register variables        116700.0  7561720.6      648.0
Double-Precision Whetstone                      55.0     1418.4      257.9
Execl Throughput                                43.0     1794.2      417.3
File Copy 1024 bufsize 2000 maxblocks         3960.0    70656.0      178.4
File Copy 256 bufsize 500 maxblocks           1655.0    47665.0      288.0
File Copy 4096 bufsize 8000 maxblocks         5800.0    56051.0       96.6
Pipe Throughput                              12440.0   557754.4      448.4
Pipe-based Context Switching                  4000.0   114951.4      287.4
Process Creation                               126.0     6087.5      483.1
Shell Scripts (8 concurrent)                     6.0      580.0      966.7
System Call Overhead                         15000.0   397634.2      265.1
                                                                 =========
     FINAL SCORE                                                     332.7
unixbench  1191.09s user 1462.68s system 83% cpu 52:47.01 total
```

It's a Core 2 Duo E6320 1.86GHz with 4MB cache + Intel Q35 + 2 x 1GB 800MHz RAM.

With your 3.0GHz E8400 you should get about 1.5-2x my result.


----------



## randux (Jan 13, 2009)

I posted some benchmarks here: http://forums.freebsd.org/showthread.php?t=1427


----------



## randux (Jan 13, 2009)

It's almost exactly 2x of your result and most of the benchmarks look very good. But the system still _feels_ very slow and I don't know why.

I installed two new installs today, i386 and AMD64 both with softdeps turned on (I normally run with no softdeps on) and I ran my rarcrack benchmark and there was no change.


----------



## morbit (Feb 10, 2009)

Speaking of

dev.cpu.0.cx_lowest=C3
dev.cpu.1.cx_lowest=C2

My notebook has terminal bell stuttering problems and a prolonged shutdown sequence if both cores are set to C3.


----------



## trev (Feb 21, 2009)

*bump* I've edited the topic "cpufreq for Phenoms and Opterons (AMD Family 10h)" above with a patch supplied by author against his PR submission in November last year.


----------



## Carpetsmoker (Mar 20, 2009)

vermaden said:
			
		

> Up to yesterday there was no option to set highest value to limit max CPU speed to save power or limit overheat, but Boris Kochergin wrote a patch to support also the highest limit with debug.cpufreq.highest oid:
> `sysctl debug.cpufreq.highest=1200`



I've been thinking, and I wonder just how useful this is.

A 2GHz CPU running at 1GHz will not consume anywhere near half the power, while a CPU running at 2GHz will complete a task close to twice as fast.
The result is that the slower CPU will take more time to complete a task, and it will require more energy in the end.

I haven't done any tests, but I suspect that setting this value lower than the maximum will actually drain the battery sooner.
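One way to test this reasoning is to compare energy per task rather than power. The sketch below assumes a purely CPU-bound task whose run time scales inversely with frequency, and it takes the idle power as a parameter because nobody in this thread measured it. `task_energy` is a hypothetical helper; energy is load power times busy time, plus idle power times the rest of the window.

```shell
#!/bin/sh
# Hypothetical helper: energy in joules for one task inside a time window.
# $1 = watts under load at this frequency   $2 = this frequency (MHz)
# $3 = reference frequency (MHz)            $4 = task time at reference (s)
# $5 = idle watts                           $6 = window length (s)
task_energy() {
	awk -v w="$1" -v f="$2" -v rf="$3" -v rs="$4" -v iw="$5" -v win="$6" 'BEGIN {
		busy = rs * rf / f          # seconds needed at this frequency
		if (busy > win) win = busy  # the window cannot be shorter than the task
		printf "%.1f\n", w * busy + iw * (win - busy)
	}'
}
```

With vermaden's whole-laptop figures (47 W at 2000 MHz, 27 W at 1200 MHz) and a task that takes 10 s at 2000 MHz, `task_energy 47 2000 2000 10 0 10` prints `470.0` and `task_energy 27 1200 2000 10 0 10` prints `450.0`, so on that machine the cap does not obviously cost energy. Figures where the lower frequency saves proportionally less power would tip the balance the other way.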


----------



## trev (May 30, 2009)

Talking of power savings, from a FreeBSD mailing list in November last year with an early beta version of the AMD 10h cpufreq patch *which I have updated above for 10h and now 11h CPUs*:



> cpu: Phenom 9350e quadcore 4x 2.0GHz (energy efficient 65W version, not black edition)
> mem: 1GB DDR2/666 CL5 (1 DIMM)
> mobo: Asus M3N72-D Socket AM2+ with nVidia 750a SLI chipset.
> hdd: 2,5" 40GB Hitachi notebook HDD on Parallel ATA (udma33)
> ...


----------



## morbit (May 30, 2009)

Mind you, this is from CURRENT, but it is a very nice thread about power saving:

http://lists.freebsd.org/pipermail/freebsd-current/2009-May/006436.html


----------



## vermaden (Jun 1, 2009)

morbit said:
			
		

> Mind you this is from CURRENT, however this is very nice thread about power saving:
> 
> http://lists.freebsd.org/pipermail/freebsd-current/2009-May/006436.html



Yes, great post, I have read it some time ago.


----------



## vermaden (Oct 27, 2009)

*Update:*

FreeBSD 8.0-RC1/RC2 does not offer as many frequency levels as 7.2-RELEASE, bug submitted:
http://freebsd.org/cgi/query-pr.cgi?pr=140010


----------



## richardpl (Oct 27, 2009)

Are you sure that you don't have some lines in loader.conf?
What CPU is that, and what modules are loaded?
Better to post this on the mailing list, because a PR may be ignored for a while.


----------



## vermaden (Oct 27, 2009)

@richardpl

You are right of course, it was this setting in /boot/loader.conf:
`hint.acpi_throttle.0.disabled=1`. I will update the "bug" info right now.


----------



## oliverh (Oct 27, 2009)

vermaden said:
			
		

> *Update:*
> 
> FreeBSD 8.0-RC1/RC2 does not offer as many frequency levels as 7.2-RELEASE, bug submitted:
> http://freebsd.org/cgi/query-pr.cgi?pr=140010



Hi vermaden

Maybe it's not a bug, but a feature? Many of the frequency levels shown are more or less nonsense on most CPUs.


----------



## richardpl (Oct 27, 2009)

I use that one in loader.conf for the following reasons:
1. acpi_throttle sometimes fails to attach on the second core
2. acpi_throttle gives very little power saving and a very big performance drop (when combined with the wrong powerd flags)
3. it actually just makes the CPU wait/halt - it doesn't put it in any "lower power state"
4. I really hate acpi


----------



## vermaden (Oct 27, 2009)

@oliverh

No mate, it's just my fault because I set an option that "disabled" most of them:
http://forums.freebsd.org/showpost.php?p=46520&postcount=31

@richardpl

I added that option because of this post:
http://lists.freebsd.org/pipermail/freebsd-current/2009-May/006436.html

I probably misread something, but thanks for your reasons as well, they may be useful in the future.


----------



## trev (Jan 22, 2010)

*bump* I've edited the post "cpufreq for Phenoms and Opterons (AMD Family 10h)" above as a new file is now required for the current FreeBSD 7.2-STABLE source.


----------



## vermaden (Jan 22, 2010)

@trev

What about 8.0-RELEASE/8-STABLE, have these changes already been merged there?


----------



## trev (Jan 24, 2010)

vermaden said:
			
		

> What about 8.0-RELEASE/8-STABLE, have these changes already been merged there?



Not according to http://cameldung.org/man/index.cgi?...ropos=0&manpath=FreeBSD+8.0-RELEASE+and+Ports and http://cameldung.org/man/index.cgi?...tion=4&manpath=FreeBSD+8.0-stable&format=html which only show support for K7 and K8 (not K10 and K11).

I don't know for sure as I haven't upgraded the AMD box to 8 yet because 8.0-R and -S do not boot on my other system (Mac Mini, early 2009) and I like to keep them in sync (sharing same source tree).


----------



## royce (May 31, 2010)

Carpetsmoker said:
			
		

> I've been thinking, and I wonder just how useful this is.
> 
> A 2GHz CPU running at 1GHz will not drop to anywhere near half the power draw, while the same CPU running at 2GHz will complete a task close to twice as fast.
> The result is that the slower CPU will take more time to complete the task, and it will use more energy in the end.
> ...



Dropping the CPU frequency can be useful for non-power-saving reasons.

I'm actually interested in debug.cpufreq.highest for underclocking a system with a bad fan (that I can't replace until later this week).


----------



## aragon (May 31, 2010)

royce said:
			
		

> I'm actually interested in debug.cpufreq.highest for underclocking a system with a bad fan (that I can't replace until later this week).


I suspect that's unnecessary.  Between Intel and AMD, I don't think any CPUs have been made without thermal protection in the last 5 years.  They limit their own clock speed when they get too hot...


----------



## Carpetsmoker (May 31, 2010)

Yes, they limit themselves by shutting down.


----------



## chavez243ca (Nov 5, 2010)

*no savings...*

I enabled powerd on a Dell PE1900 - single Xeon quad-core, using flags


```
-a adaptive -b adaptive -i 90 -r 50 -p 100
```

When powerd was started, the CPU was throttled down to ~600MHz, but there was zero change in the overall power consumption of the server, as measured by the UPS it is connected to.  At idle I was still seeing a draw of ~105 watts.  I ran unixbench to get an idea of what full load draws, and that was ~140 watts peak.

Turning off powerd - I saw no change in consumption...

Any thoughts?


----------



## morbit (Nov 5, 2010)

Do you use C3 or deeper states? 

Check:


```
$ sysctl dev.cpu.0.cx_lowest
```

and


```
$ sysctl dev.cpu.0.cx_supported
```


```
$ sysctl dev.cpu.0.cx_usage
```


Also, from the acpi man page:



> The acpi CPU idle power management driver conflicts with the local APIC
> (LAPIC) timer.  Disable the local APIC timer with hint.apic.0.clock=0 or
> do not use the C3 and deeper states if the local APIC timer is enabled.


----------



## chavez243ca (Nov 6, 2010)

Supported C-states are C1/0 C2/60
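
For anyone reading that value in a script, here is a small sketch that splits it into state names and their transition latencies. It assumes the `state/latency` layout shown above, with the number being the transition latency in microseconds. The function name `parse_cx_supported` is made up for this example.

```python
def parse_cx_supported(value):
    """Turn a string such as 'C1/0 C2/60' into {'C1': 0, 'C2': 60}."""
    states = {}
    for token in value.split():
        # Assumption: each token is '<state>/<latency>' as in the post above.
        name, _, latency = token.partition("/")
        states[name.upper()] = int(latency)
    return states


supported = parse_cx_supported("C1/0 C2/60")

# The deepest state is the one with the highest number after the 'C'.
deepest = max(supported, key=lambda state: int(state[1:]))

print(supported, "deepest:", deepest)
```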


----------



## morbit (Nov 6, 2010)

Set


```
performance_cx_lowest="C2"
economy_cx_lowest="C2"
```

in /etc/rc.conf, and see if it changes power draw.

You can check C-state usage with `sysctl dev.cpu.0.cx_usage`
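
If you want to compare the residency figures before and after the change, a short sketch like the one below can pull the percentages out of the value. It assumes the value is a space-separated list of percentages, one per C-state in order starting at C1, and ignores anything else on the line. The sample string is hypothetical, and `parse_cx_usage` is a name invented for this example.

```python
import re


def parse_cx_usage(value):
    """Map C1, C2, ... to the residency percentages found in the value string."""
    # Assumption: percentages appear in C-state order; other text is skipped.
    percentages = re.findall(r"(\d+(?:\.\d+)?)%", value)
    return {f"C{i}": float(p) for i, p in enumerate(percentages, start=1)}


sample = "37.50% 62.50%"  # hypothetical value, not taken from a real system
usage = parse_cx_usage(sample)

print(usage)
```

If C2 stays near zero after setting `cx_lowest`, the CPU is not really entering the deeper state, and no change in power draw should be expected from it.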


----------



## chavez243ca (Nov 7, 2010)

Oddly enough, making those changes actually increased the consumption to an average of 140 watts; UPS load went from 14% to 18%.  I did notice in the BIOS, though, that the performance-based power consumption feature is read-only and set to disabled, which suggests this Xeon lacks certain support.  The sysctl knobs do indicate that the rc.conf settings forced it to use C2 states.  It does not make much sense, but it's likely not worth putting a bunch of work into; I have bigger fish to fry.


----------



## BertK88 (Dec 28, 2011)

chavez243ca said:
			
		

> oddly enough - making those changes has actually increased the consumption to an average of 140 watts.  UPS load went from 14% to 18%.  I did notice in BIOS though that performance based power consumption feature is read-only and set to disable - which is an indicator this Xeon lacks certain support.  Sysctl knobs do indicate the rc.conf settings did force it to use C2 states.  Does not make much sense - but it's likely not worth putting a bunch of work into, I have bigger fish to fry.



My Intel server has slightly better savings with:
/etc/rc.conf

```
performance_cx_lowest="C3"
economy_cx_lowest="C3"
```

than with:
/etc/rc.conf

```
powerd_enable="YES"
powerd_flags="-m 199 -M 2395 -a adaptive -n adaptive"
```

Using them together gives a little higher energy use.

Regards,

Bert


----------



## vermaden (Dec 29, 2011)

BertK88 said:
			
		

> Using them together gives a little higher energy use.


Then why not submit this as a bug?


----------



## aragon (Dec 30, 2011)

BertK88 said:
			
		

> Using them together gives a little higher energy use.



I suggest you try adding the following to /boot/loader.conf:


```
hint.p4tcc.0.disabled="1"
hint.acpi_throttle.0.disabled="1"
```

I've seen more than one FreeBSD dev saying (more kindly than me) that P4TCC is useless, and I for one have seen one of my systems use more energy with it enabled.

YMMV


----------

