bcachefs.git - Unnamed repository; edit this file 'description' to name the repository.

Age	Commit message (Collapse)	Author
2011-05-14	Revert "libata: ahci_start_engine compliant to AHCI spec"	Tejun Heo
	This reverts commit 270dac35c26433d06a89150c51e75ca0181ca7e4. The commits causes command timeouts on AC plug/unplug. It isn't yet clear why. As the commit was for a single rather obscure controller, revert the change for now. The problem was reported and bisected by Gu Rui in bug#34692. https://bugzilla.kernel.org/show_bug.cgi?id=34692 Also, reported by Rafael and Michael in the following thread. http://thread.gmane.org/gmane.linux.kernel/1138771 Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Gu Rui <chaos.proton@gmail.com> Reported-by: Rafael J. Wysocki <rjw@sisk.pl> Reported-by: Michael Leun <lkml20100708@newton.leun.net> Cc: Jian Peng <jipeng2005@gmail.com> Cc: Jeff Garzik <jgarzik@pobox.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-14	Further fbcon sanity checking	Bruno Prémont
	This moves the if (num_registered_fb == FB_MAX) return -ENXIO; check _AFTER_ the call to do_remove_conflicting_framebuffers() as this would (now in a safe way) allow a native driver to replace the conflicting one even if all slots in registered_fb[] are taken. This also prevents unregistering a framebuffer that is no longer registered (vga16f will unregister at module unload time even if the frame buffer had been unregistered earlier due to being found conflicting). Signed-off-by: Bruno Prémont <bonbons@linux-vserver.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-14	fbmem: fix remove_conflicting_framebuffers races	Linus Torvalds
	When a register_framebuffer() call results in us removing old conflicting framebuffers, the new registration_lock doesn't protect that situation. And we can't just add the same locking to the function, because these functions call each other: register_framebuffer() calls remove_conflicting_framebuffers, which in turn calls unregister_framebuffer for any conflicting entry. In order to fix it, this just creates wrapper functions around all three functions and makes the versions that actually do the work be called "do_xxx()", leaving just the wrapper that gets the lock and calls the worker function. So the rule becomes simply that "do_xxxx()" has to be called with the lock held, and now do_register_framebuffer() can just call do_remove_conflicting_framebuffers(), and that in turn can call _do_unregister_framebuffer(), and there is no deadlock, and we can hold the registration lock over the whole sequence, fixing the races. It also makes error cases simpler, and fixes one situation where we would return from unregister_framebuffer() without releasing the lock, pointed out by Bruno Prémont. Tested-by: Bruno Prémont <bonbons@linux-vserver.org> Tested-by: Anca Emanuel <anca.emanuel@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-13	Merge branch 'for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha-2.6: alpha: Wire up syscalls new to 2.6.39 alpha: convert to clocksource_register_hz
2011-05-13	alpha: Wire up syscalls new to 2.6.39	Michael Cree
	Wire up the syscalls: name_to_handle_at open_by_handle_at clock_adjtime syncfs and adjust some whitespace in the neighbourhood to align commments. Signed-off-by: Michael Cree <mcree@orcon.net.nz> Signed-off-by: Matt Turner <mattst88@gmail.com>
2011-05-13	alpha: convert to clocksource_register_hz	John Stultz
	Converts alpha to use clocksource_register_hz. Signed-off-by: John Stultz <johnstul@us.ibm.com> CC: Richard Henderson <rth@twiddle.net> CC: Ivan Kokshaysky <ink@jurassic.park.msu.ru> CC: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Matt Turner <mattst88@gmail.com>
2011-05-13	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6	Linus Torvalds
	* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: bridge: fix forwarding of IPv6 bonding,llc: Fix structure sizeof incompatibility for some PDUs ipv6: restore correct ECN handling on TCP xmit ne-h8300: Fix regression caused during net_device_ops conversion hydra: Fix regression caused during net_device_ops conversion zorro8390: Fix regression caused during net_device_ops conversion sfc: Always map MCDI shared memory as uncacheable ehea: Fix memory hotplug oops libertas: fix cmdpendingq locking iwlegacy: fix IBSS mode crashes ath9k: Fix a warning due to a queued work during S3 state mac80211: don't start the dynamic ps timer if not associated
2011-05-13	Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6	Linus Torvalds
	* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: NFSv4.1: Ensure that layoutget uses the correct gfp modes NFSv4.1: remove pnfs_layout_hdr from pnfs_destroy_all_layouts tmp_list NFSv41: Resend on NFS4ERR_RETRY_UNCACHED_REP
2011-05-13	rbd: fix split bio handling	Yehuda Sadeh
	The rbd driver currently splits bios when they span an object boundary. However, the blk_end_request expects the completions to roll up the results in block device order, and the split rbd/ceph ops can complete in any order. This patch adds a struct rbd_req_coll to track completion of split requests and ensures that the results are passed back up to the block layer in order. This fixes errors where the file system gets completion of a read operation that spans an object boundary before the data has actually arrived. The bug is easily reproduced with iozone with a working set larger than available RAM. Reported-by: Fyodor Ustinov <ufm@ufm.su> Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: Sage Weil <sage@newdream.net>
2011-05-13	bridge: fix forwarding of IPv6	Stephen Hemminger
	The commit 6b1e960fdbd75dcd9bcc3ba5ff8898ff1ad30b6e bridge: Reset IPCB when entering IP stack on NF_FORWARD broke forwarding of IPV6 packets in bridge because it would call bp_parse_ip_options with an IPV6 packet. Reported-by: Noah Meyerhans <noahm@debian.org> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-13	drm/i915: Revert i915.semaphore=1 default from i915 merge	Andy Lutomirski
	My Q67 / i7-2600 box has rev09 Sandy Bridge graphics. It hangs instantly when GNOME loads and it hangs so hard the reset button doesn't work. Setting i915.semaphore=0 fixes it. Semaphores were disabled in a1656b9090f7 ("drm/i915: Disable GPU semaphores by default") in 2.6.38 but were then re-enabled (by mistake?) by the merge 47ae63e0c2e5 ("Merge branch 'drm-intel-fixes' into drm-intel-next"). (It's worth noting that the offending change is i915_drv.c, which was not marked as a conflict - although a 'git show --cc' on the merge does show that neither parent had it set to 1) Signed-off-by: Andy Lutomirski <luto@mit.edu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-13	bonding,llc: Fix structure sizeof incompatibility for some PDUs	Vitalii Demianets
	With some combinations of arch/compiler (e.g. arm-linux-gcc) the sizeof operator on structure returns value greater than expected. In cases when the structure is used for mapping PDU fields it may lead to unexpected results (such as holes and alignment problems in skb data). __packed prevents this undesired behavior. Signed-off-by: Vitalii Demianets <vitas@nppfactor.kiev.ua> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-13	vfs: micro-optimize acl_permission_check()	Linus Torvalds
	It's a hot function, and we're better off not mixing types in the mask calculations. The compiler just ends up mixing 16-bit and 32-bit operations, for no good reason. So do everything in 'unsigned int' rather than mixing 'unsigned int' masking with a 'umode_t' (16-bit) mode variable. This, together with the parent commit (47a150edc2ae: "Cache user_ns in struct cred") makes acl_permission_check() much nicer. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-13	Cache user_ns in struct cred	Serge E. Hallyn
	If !CONFIG_USERNS, have current_user_ns() defined to (&init_user_ns). Get rid of _current_user_ns. This requires nsown_capable() to be defined in capability.c rather than as static inline in capability.h, so do that. Request_key needs init_user_ns defined at current_user_ns if !CONFIG_USERNS, so forward-declare that in cred.h if !CONFIG_USERNS at current_user_ns() define. Compile-tested with and without CONFIG_USERNS. Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com> [ This makes a huge performance difference for acl_permission_check(), up to 30%. And that is one of the hottest kernel functions for loads that are pathname-lookup heavy. ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-13	x86, mce, AMD: Fix leaving freed data in a list	Julia Lawall
	b may be added to a list, but is not removed before being freed in the case of an error. This is done in the corresponding deallocation function, so the code here has been changed to follow that. The sematic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression E,E1,E2; identifier l; @@ list_add(&E->l,E1); ... when != E1 when != list_del(&E->l) when != list_del_init(&E->l) when != E = E2 kfree(E);// </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Robert Richter <robert.richter@amd.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: <stable@kernel.org> Link: http://lkml.kernel.org/r/1305294731-12127-1-git-send-email-julia@diku.dk Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-13	OMAP3: set the core dpll clk rate in its set_rate function	Avinash H.M
	The debug l3_ick/rate is not displaying the actual rate of the clock in hardware. This is because, the core dpll set_rate function doesn't update the clk.rate. After fixing, the l3_ick/rate is displaying proper values. Signed-off-by: Shweta Gulati <shweta.gulati@ti.com> Signed-off-by: Avinash.H.M <avinashhm@ti.com> Cc: Rajendra Nayak <rnayak@ti.com> Cc: Paul Wamsley <paul@pwsan.com> Acked-by: Paul Walmsley <paul@pwsan.com> Signed-off-by: Tony Lindgren <tony@atomide.com>
2011-05-13	drm/radeon/kms: add some evergreen/ni safe regs	Alex Deucher
	need to programmed from the userspace drivers. Signed-off-by: Alex Deucher <alexdeucher@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-05-13	drm/radeon/kms: fix extended lvds info parsing	Alex Deucher
	On rev <= 1.1 tables, the offset is absolute, on newer tables, it's relative. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=700326 Signed-off-by: Alex Deucher <alexdeucher@gmail.com> Reviewed-by: Jerome Glisse <jglisse@redhat.com> Cc: stable@kernel.org Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-05-13	drm/radeon/kms: fix tiling reg on fusion	Alex Deucher
	The location of MC_ARB_RAMCFG changed on fusion. I've diffed all the other regs in evergreend.h and this is the only other reg that changed. Signed-off-by: Alex Deucher <alexdeucher@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-05-12	rbd: fix leak of ops struct	Sage Weil
	The ops vector must be freed by the rbd_do_request caller. Signed-off-by: Sage Weil <sage@newdream.net>
2011-05-12	Merge branch 'for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: SELinux: delete debugging printks from filename_trans rule processing
2011-05-12	Merge branch 'for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: net/9p/protocol.c: Fix a memory leak
2011-05-12	Merge branch 'for-2639-rc7/i2c-fixes' of git://git.fluff.org/bjdooks/linux	Linus Torvalds
	* 'for-2639-rc7/i2c-fixes' of git://git.fluff.org/bjdooks/linux: i2c: pnx: Fix crash due to wrong init of timer->data
2011-05-13	Merge branch 'for-linus' of git://git.infradead.org/users/eparis/selinux ↵	James Morris
	into for-linus
2011-05-13	i2c: pnx: Fix crash due to wrong init of timer->data	Wolfram Sang
	alg_data is already a pointer which must be passed directly. Reported-by: Dieter Ripp <ripp@systecnet.com> Signed-off-by: Wolfram Sang <w.sang@pengutronix.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Ben Dooks <ben-i2c@fluff.org> Signed-off-by: Ben Dooks <ben-linux@fluff.org>
2011-05-12	ipv6: restore correct ECN handling on TCP xmit	Steinar H. Gunderson
	Since commit e9df2e8fd8fbc9 (Use appropriate sock tclass setting for routing lookup) we lost ability to properly add ECN codemarks to ipv6 TCP frames. It seems like TCP_ECN_send() calls INET_ECN_xmit(), which only sets the ECN bit in the IPv4 ToS field (inet_sk(sk)->tos), but after the patch, what's checked is inet6_sk(sk)->tclass, which is a completely different field. Close bug https://bugzilla.kernel.org/show_bug.cgi?id=34322 [Eric Dumazet] : added the INET_ECN_dontxmit() fix and replace macros by inline functions for clarity. Signed-off-by: Steinar H. Gunderson <sgunderson@bigfoot.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-12	vsprintf: Turn kptr_restrict off by default	Ingo Molnar
	kptr_restrict has been triggering bugs in apps such as perf, and it also makes the system less useful by default, so turn it off by default. This is how we generally handle security features that remove functionality, such as firewall code or SELinux - they have to be configured and activated from user-space. Distributions can turn kptr_restrict on again via this line in /etc/sysctrl.conf: kernel.kptr_restrict = 1 ( Also mark the variable __read_mostly while at it, as it's typically modified only once per bootup, or not at all. ) Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-12	net/9p/protocol.c: Fix a memory leak	Pedro Scarapicchia Junior
	When p9pdu_readf() is called with "s" attribute, it allocates a pointer that will store a string. In p9dirent_read(), this pointer is not being released, leading to out of memory errors. This patch releases this pointer after string is copyed to dirent->d_name. Signed-off-by: Pedro Scarapicchia Junior <pedro.scarapiccha@br.flextronics.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-05-12	x86: Fix UV BAU for non-consecutive nasids	Cliff Wickman
	This is a fix for the SGI Altix-UV Broadcast Assist Unit code, which is used for TLB flushing. Certain hardware configurations (that customers are ordering) cause nasids (numa address space id's) to be non-consecutive. Specifically, once you have more than 4 blades in a IRU (Individual Rack Unit - or 1/2 rack) but less than the maximum of 16, the nasid numbering becomes non-consecutive. This currently results in a 'catastrophic error' (CATERR) detected by the firmware during OS boot. The BAU is generating an 'INTD' request that is targeting a non-existent nasid value. Such configurations may also occur when a blade is configured off because of hardware errors. (There is one UV hub per blade.) This patch is required to support such configurations. The problem with the tlb_uv.c code is that is using the consecutive hub numbers as indices to the BAU distribution bit map. These are simply the ordinal position of the hub or blade within its partition. It should be using physical node numbers (pnodes), which correspond to the physical nasid values. Use of the hub number only works as long as the nasids in the partition are consecutive and increase with a stride of 1. This patch changes the index to be the pnode number, thus allowing nasids to be non-consecutive. It also provides a table in local memory for each cpu to translate target cpu number to target pnode and nasid. And it improves naming to properly reflect 'node' and 'uvhub' versus 'nasid'. Signed-off-by: Cliff Wickman <cpw@sgi.com> Cc: <stable@kernel.org> Link: http://lkml.kernel.org/r/E1QJmxX-0002Mz-Fk@eag09.americas.sgi.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-12	ne-h8300: Fix regression caused during net_device_ops conversion	Geert Uytterhoeven
	Changeset dcd39c90290297f6e6ed8a04bb20da7ac2b043c5 ("ne-h8300: convert to net_device_ops") broke ne-h8300 by adding 8390.o to the link. That meant that lib8390.c was included twice, once in ne-h8300.c and once in 8390.c, subject to different macros. This patch reverts that by avoiding the wrappers in 8390.c. Fix based on commits 217cbfa856dc1cbc2890781626c4032d9e3ec59f ("mac8390: fix regression caused during net_device_ops conversion") and 4e0168fa4842e27795a75b205a510f25b62181d9 ("mac8390: fix build with NET_POLL_CONTROLLER"). Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: stable@kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-12	hydra: Fix regression caused during net_device_ops conversion	Geert Uytterhoeven
	Changeset 5618f0d1193d6b051da9b59b0e32ad24397f06a4 ("hydra: convert to net_device_ops") broke hydra by adding 8390.o to the link. That meant that lib8390.c was included twice, once in hydra.c and once in 8390.c, subject to different macros. This patch reverts that by avoiding the wrappers in 8390.c. Fix based on commits 217cbfa856dc1cbc2890781626c4032d9e3ec59f ("mac8390: fix regression caused during net_device_ops conversion") and 4e0168fa4842e27795a75b205a510f25b62181d9 ("mac8390: fix build with NET_POLL_CONTROLLER"). Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: stable@kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-12	zorro8390: Fix regression caused during net_device_ops conversion	Geert Uytterhoeven
	Changeset b6114794a1c394534659f4a17420e48cf23aa922 ("zorro8390: convert to net_device_ops") broke zorro8390 by adding 8390.o to the link. That meant that lib8390.c was included twice, once in zorro8390.c and once in 8390.c, subject to different macros. This patch reverts that by avoiding the wrappers in 8390.c. Fix based on commits 217cbfa856dc1cbc2890781626c4032d9e3ec59f ("mac8390: fix regression caused during net_device_ops conversion") and 4e0168fa4842e27795a75b205a510f25b62181d9 ("mac8390: fix build with NET_POLL_CONTROLLER"). Reported-by: Christian T. Steigies <cts@debian.org> Suggested-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: Christian T. Steigies <cts@debian.org> Cc: stable@kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-12	Merge branch 'sfc-2.6.39' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-2.6
2011-05-12	SELinux: delete debugging printks from filename_trans rule processing	Eric Paris
	The filename_trans rule processing has some printk(KERN_ERR ) messages which were intended as debug aids in creating the code but weren't removed before it was submitted. Remove them. Reported-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Eric Paris <eparis@redhat.com>
2011-05-12	Merge branch 'fix/asoc' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6 * 'fix/asoc' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6: ASoC: WM8903: Fix Digital Capture Volume range ASoC: UDA134x: Remove POWER_OFF_ON_STANDBY define. ASoC: SSM2602: Fix reg_cache_size ASoC: SSM2602: Fix 'Mic Boost2' control ASoC: SSM2602: Properly annotate i2c probe and remove functions ASoC: sst_platform: add hw_free callback to fix resource leak ASoC: Don't crash on PM operations ASoC: JZ4740: Fix i2s shutdown
2011-05-12	Merge branch 'stable/bug-fixes-for-rc7' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen * 'stable/bug-fixes-for-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: x86/mm: Fix section mismatch derived from native_pagetable_reserve() x86,xen: introduce x86_init.mapping.pagetable_reserve Revert "xen/mmu: Add workaround "x86-64, mm: Put early page table high""
2011-05-12	Revert "drm/i915: Only enable the plane after setting the fb base (pre-ILK)"	Linus Torvalds
	This reverts commit 49183b2818de6899383bb82bc032f9344d6791ff. Quoth Franz Melchior: "This patch introduces a bug on my infamous "Acer Travelmate 5735Z-452G32Mnss": when KMS takes over, the frame buffer contents get completely garbled up on screen, with colored stripes and unreadable text (photo on request). Only when X11 is started, the screen gets restored again. Closing and re-opening the lid partly cures the mess, too: it makes the font readable, though horizontally stretched." Acked-by: Keith Packard <keithp@keithp.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-12	Merge branch 'fbmem'	Linus Torvalds
	* fbmem: fbmem: make read/write/ioctl use the frame buffer at open time fbcon: add lifetime refcount to opened frame buffers
2011-05-12	Merge branch 'for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: ads7846 - remove unused variable from struct ads7845_ser_req Input: ads7846 - make transfer buffers DMA safe
2011-05-12	x86/mm: Fix section mismatch derived from native_pagetable_reserve()	Sedat Dilek
	With CONFIG_DEBUG_SECTION_MISMATCH=y I see these warnings in next-20110415: LD vmlinux.o MODPOST vmlinux.o WARNING: vmlinux.o(.text+0x1ba48): Section mismatch in reference from the function native_pagetable_reserve() to the function .init.text:memblock_x86_reserve_range() The function native_pagetable_reserve() references the function __init memblock_x86_reserve_range(). This is often because native_pagetable_reserve lacks a __init annotation or the annotation of memblock_x86_reserve_range is wrong. This patch fixes the issue. Thanks to pipacs from PaX project for help on IRC. Acked-by: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12	x86,xen: introduce x86_init.mapping.pagetable_reserve	Stefano Stabellini
	Introduce a new x86_init hook called pagetable_reserve that at the end of init_memory_mapping is used to reserve a range of memory addresses for the kernel pagetable pages we used and free the other ones. On native it just calls memblock_x86_reserve_range while on xen it also takes care of setting the spare memory previously allocated for kernel pagetable pages from RO to RW, so that it can be used for other purposes. A detailed explanation of the reason why this hook is needed follows. As a consequence of the commit: commit 4b239f458c229de044d6905c2b0f9fe16ed9e01e Author: Yinghai Lu <yinghai@kernel.org> Date: Fri Dec 17 16:58:28 2010 -0800 x86-64, mm: Put early page table high at some point init_memory_mapping is going to reach the pagetable pages area and map those pages too (mapping them as normal memory that falls in the range of addresses passed to init_memory_mapping as argument). Some of those pages are already pagetable pages (they are in the range pgt_buf_start-pgt_buf_end) therefore they are going to be mapped RO and everything is fine. Some of these pages are not pagetable pages yet (they fall in the range pgt_buf_end-pgt_buf_top; for example the page at pgt_buf_end) so they are going to be mapped RW. When these pages become pagetable pages and are hooked into the pagetable, xen will find that the guest has already a RW mapping of them somewhere and fail the operation. The reason Xen requires pagetables to be RO is that the hypervisor needs to verify that the pagetables are valid before using them. The validation operations are called "pinning" (more details in arch/x86/xen/mmu.c). In order to fix the issue we mark all the pages in the entire range pgt_buf_start-pgt_buf_top as RO, however when the pagetable allocation is completed only the range pgt_buf_start-pgt_buf_end is reserved by init_memory_mapping. Hence the kernel is going to crash as soon as one of the pages in the range pgt_buf_end-pgt_buf_top is reused (b/c those ranges are RO). For this reason we need a hook to reserve the kernel pagetable pages we used and free the other ones so that they can be reused for other purposes. On native it just means calling memblock_x86_reserve_range, on Xen it also means marking RW the pagetable pages that we allocated before but that haven't been used before. Another way to fix this is without using the hook is by adding a 'if (xen_pv_domain)' in the 'init_memory_mapping' code and calling the Xen counterpart, but that is just nasty. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Yinghai Lu <yinghai@kernel.org> Acked-by: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12	Revert "xen/mmu: Add workaround "x86-64, mm: Put early page table high""	Konrad Rzeszutek Wilk
	This reverts commit a38647837a411f7df79623128421eef2118b5884. It does not work with certain AMD machines. last_pfn = 0x100000 max_arch_pfn = 0x400000000 initial memory mapped : 0 - 02c3a000 Base memory trampoline at [ffff88000009b000] 9b000 size 20480 init_memory_mapping: 0000000000000000-0000000100000000 0000000000 - 0100000000 page 4k kernel direct mapping tables up to 100000000 @ ff7fb000-100000000 init_memory_mapping: 0000000100000000-00000001e0800000 0100000000 - 01e0800000 page 4k kernel direct mapping tables up to 1e0800000 @ 1df0f3000-1e0000000 xen: setting RW the range fffdc000 - 100000000 RAMDISK: 0203b000 - 02c3a000 No NUMA configuration found Faking a node at 0000000000000000-00000001e0800000 NUMA: Using 63 for the hash shift. Initmem setup node 0 0000000000000000-00000001e0800000 NODE_DATA [00000001dfffb000 - 00000001dfffffff] BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff81cf6a75>] setup_node_bootmem+0x18a/0x1ea PGD 0 Oops: 0003 [#1] SMP last sysfs file: CPU 0 Modules linked in: Pid: 0, comm: swapper Not tainted 2.6.39-0-virtual #6~smb1 RIP: e030:[<ffffffff81cf6a75>] [<ffffffff81cf6a75>] setup_node_bootmem+0x18a/0x1ea RSP: e02b:ffffffff81c01e38 EFLAGS: 00010046 RAX: 0000000000000000 RBX: 00000001e0800000 RCX: 0000000000001040 RDX: 0000000000004100 RSI: 0000000000000000 RDI: ffff8801dfffb000 RBP: ffffffff81c01e58 R08: 0000000000000020 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000bfe400 FS: 0000000000000000(0000) GS:ffffffff81cca000(0000) knlGS:0000000000000000 CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c03000 CR4: 0000000000000660 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process swapper (pid: 0, threadinfo ffffffff81c00000, task ffffffff81c0b020) Stack: 0000000000000040 0000000000000001 0000000000000000 ffffffffffffffff ffffffff81c01e88 ffffffff81cf6c25 0000000000000000 0000000000000000 ffffffff81cf687f 0000000000000000 ffffffff81c01ea8 ffffffff81cf6e45 Call Trace: [<ffffffff81cf6c25>] numa_register_memblks.constprop.3+0x150/0x181 [<ffffffff81cf687f>] ? numa_add_memblk+0x7c/0x7c [<ffffffff81cf6e45>] numa_init.part.2+0x1c/0x7c [<ffffffff81cf687f>] ? numa_add_memblk+0x7c/0x7c [<ffffffff81cf6f67>] numa_init+0x6c/0x70 [<ffffffff81cf7057>] initmem_init+0x39/0x3b [<ffffffff81ce5865>] setup_arch+0x64e/0x769 [<ffffffff815e43c1>] ? printk+0x51/0x53 [<ffffffff81cdf92b>] start_kernel+0xd4/0x3f3 [<ffffffff81cdf388>] x86_64_start_reservations+0x132/0x136 [<ffffffff81ce2ed4>] xen_start_kernel+0x588/0x58f Code: 41 00 00 48 8b 3c c5 a0 24 cc 81 31 c0 40 f6 c7 01 74 05 aa 66 ba ff 40 40 f6 c7 02 74 05 66 ab 83 ea 02 89 d1 c1 e9 02 f6 c2 02 <f3> ab 74 02 66 ab 80 e2 01 74 01 aa 49 63 c4 48 c1 eb 0c 44 89 RIP [<ffffffff81cf6a75>] setup_node_bootmem+0x18a/0x1ea RSP <ffffffff81c01e38> CR2: 0000000000000000 ---[ end trace a7919e7f17c0a725 ]--- Kernel panic - not syncing: Attempted to kill the idle task! Pid: 0, comm: swapper Tainted: G D 2.6.39-0-virtual #6~smb1 Reported-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12	Merge branch 'for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: fix oops in revalidate when called with NULL nameidata
2011-05-12	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6	Linus Torvalds
	* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6: sparc32: Fixed unaligned memory copying in function __csum_partial_copy_sparc_generic sparc32: fix sparcstation 5 boot sparc32: fix section mismatch warnings in apc, pmc and time_32
2011-05-12	Merge branch 'fixes' of master.kernel.org:/home/rmk/linux-2.6-arm	Linus Torvalds
	* 'fixes' of master.kernel.org:/home/rmk/linux-2.6-arm: ARM: 6870/1: The mandatory barrier rmb() must be a dsb() in for device accesses ARM: 6892/1: handle ptrace requests to change PC during interrupted system calls ARM: 6890/1: memmap: only free allocated memmap entries when using SPARSEMEM ARM: zImage: the page table memory must be considered before relocation ARM: zImage: make sure not to relocate on top of the relocation code ARM: zImage: Fix bad SP address after relocating kernel ARM: zImage: make sure the stack is 64-bit aligned ARM: RiscPC: acornfb: fix section mismatches ARM: RiscPC: etherh: fix section mismatches
2011-05-12	fbmem: make read/write/ioctl use the frame buffer at open time	Linus Torvalds
	read/write/ioctl on a fbcon file descriptor has traditionally used the fbcon not when it was opened, but as it was at the time of the call. That makes no sense, but the lack of sense is much more obvious now that we properly ref-count the usage - it means that the ref-counting doesn't actually protect operations we do on the frame buffer. This changes it to look at the fb_info that we got at open time, but in order to avoid using a frame buffer long after it has been unregistered, we do verify that it is still current, and return -ENODEV if not. Acked-by: Tim Gardner <tim.gardner@canonical.com> Tested-by: Daniel J Blueman <daniel.blueman@gmail.com> Tested-by: Anca Emanuel <anca.emanuel@gmail.com> Cc: Bruno Prémont <bonbons@linux-vserver.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Dave Airlie <airlied@redhat.com> Cc: Andy Whitcroft <andy.whitcroft@canonical.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-12	fbcon: add lifetime refcount to opened frame buffers	Linus Torvalds
	This just adds the refcount and the new registration lock logic. It does not (for example) actually change the read/write/ioctl routines to actually use the frame buffer that was opened: those function still end up alway susing whatever the current frame buffer is at the time of the call. Without this, if something holds the frame buffer open over a framebuffer switch, the close() operation after the switch will access a fb_info that has been free'd by the unregistering of the old frame buffer. (The read/write/ioctl operations will normally not cause problems, because they will - illogically - pick up the new fbcon instead. But a switch that happens just as one of those is going on might see problems too, the window is just much smaller: one individual op rather than the whole open-close sequence.) This use-after-free is apparently fairly easily triggered by the Ubuntu 11.04 boot sequence. Acked-by: Tim Gardner <tim.gardner@canonical.com> Tested-by: Daniel J Blueman <daniel.blueman@gmail.com> Tested-by: Anca Emanuel <anca.emanuel@gmail.com> Cc: Bruno Prémont <bonbons@linux-vserver.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Dave Airlie <airlied@redhat.com> Cc: Andy Whitcroft <andy.whitcroft@canonical.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-12	sfc: Always map MCDI shared memory as uncacheable	Ben Hutchings
	We enabled write-combining for memory-mapped registers in commit 65f0b417dee94f779ce9b77102b7d73c93723b39, but inhibited it for the MCDI shared memory where this is not supported. However, write-combining mappings also allow read-reordering, which may also be a problem. I found that when an SFC9000-family controller is connected to an Intel 3000 chipset, and write-combining is enabled, the controller stops responding to PCIe read requests during driver initialisation while the driver is polling for completion of an MCDI command. This results in an NMI and system hang. Adding read memory barriers between all reads to the shared memory area appears to reduce but not eliminate the probability of this. We have not yet established whether this is a bug in our BIU or in the PCIe bridge. For now, work around by mapping the shared memory area separately. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2011-05-12	ARM: 6870/1: The mandatory barrier rmb() must be a dsb() in for device accesses	Catalin Marinas
	Since mandatory barriers may be used (explicitly or implicitly via readl etc.) to ensure the ordering between Device and Normal memory accesses, a DMB is not enough. This patch converts it to a DSB. Cc: Colin Cross <ccross@android.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-05-12	ARM: 6892/1: handle ptrace requests to change PC during interrupted system calls	Arnd Bergmann
	GDB's interrupt.exp test cases currenly fail on ARM. The problem is how do_signal handled restarting interrupted system calls: The entry.S assembler code determines that we come from a system call; and that information is passed as "syscall" parameter to do_signal. That routine then calls get_signal_to_deliver [] and if a signal is to be delivered, calls into handle_signal. If a system call is to be restarted either after the signal handler returns, or if no handler is to be called in the first place, the PC is updated after the get_signal_to_deliver call, either in handle_signal (if we have a handler) or at the end of do_signal (otherwise). Now the problem is that during [], the call to get_signal_to_deliver, a ptrace intercept may happen. During this intercept, the debugger may change registers, including the PC. This is done by GDB if it wants to execute an "inferior call", i.e. the execution of some code in the debugged program triggered by GDB. To this purpose, GDB will save all registers, allocate a stack frame, set up PC and arguments as appropriate for the call, and point the link register to a dummy breakpoint instruction. Once the process is restarted, it will execute the call and then trap back to the debugger, at which point GDB will restore all registers and continue original execution. This generally works fine. However, now consider what happens when GDB attempts to do exactly that while the process was interrupted during execution of a to-be- restarted system call: do_signal is called with the syscall flag set; it calls get_signal_to_deliver, at which point the debugger takes over and changes the PC to point to a completely different place. Now get_signal_to_deliver returns without a signal to deliver; but now do_signal decides it should be restarting a system call, and decrements the PC by 2 or 4 -- so it now points to 2 or 4 bytes before the function GDB wants to call -- which leads to a subsequent crash. To fix this problem, two things need to be supported: - do_signal must be able to recognize that get_signal_to_deliver changed the PC to a different location, and skip the restart-syscall sequence - once the debugger has restored all registers at the end of the inferior call sequence, do_signal must recognize that now it needs to restart the pending system call, even though it was now entered from a breakpoint instead of an actual svc instruction This set of issues is solved on other platforms, usually by one of two mechanisms: - The status information "do_signal is handling a system call that may need restarting" is itself carried in some register that can be accessed via ptrace. This is e.g. on Intel the "orig_eax" register; on Sparc the kernel defines a magic extra bit in the flags register for this purpose. This allows GDB to manage that state: reset it when doing an inferior call, and restore it after the call is finished. - On s390, do_signal transparently handles this problem without requiring GDB interaction, by performing system call restarting in the following way: first, adjust the PC as necessary for restarting the call. Then, call get_signal_to_deliver; and finally just continue execution at the PC. This way, if GDB does not change the PC, everything is as before. If GDB does change the PC, execution will simply continue there -- and once GDB restores the PC it saved at that point, it will automatically point to the restarted system call. (There is the minor twist how to handle system calls that do not need restarting -- do_signal will undo the PC change in this case, after get_signal_to_deliver has returned, and only if ptrace did not change the PC during that call.) Because there does not appear to be any obvious register to carry the syscall-restart information on ARM, we'd either have to introduce a new artificial ptrace register just for that purpose, or else handle the issue transparently like on s390. The patch below implements the second option; using this patch makes the interrupt.exp test cases pass on ARM, with no regression in the GDB test suite otherwise. Cc: patches@linaro.org Signed-off-by: Ulrich Weigand <ulrich.weigand@linaro.org> Signed-off-by: Arnd Bergmann <arnd.bergmann@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>